
LongLLaMA pushes the limit of context length in open-source LLMs

THE DECODER
Jul 11, 2023

Researchers have released a preview of LongLLaMA, a large language model capable of handling long contexts of up to 256,000 tokens or more. Built on the open-source OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method, it allows a subset of attention layers to access a memory cache of key-value pairs, extending the effective context length.
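To illustrate the basic idea behind this kind of memory-augmented attention, here is a minimal, hypothetical sketch in PyTorch. It is not the authors' implementation; the function name, tensor shapes, and the simple "concatenate cached keys and values in front of the local context" strategy are assumptions made for clarity.

```python
import torch
import torch.nn.functional as F

def attention_with_memory(q, k, v, mem_k=None, mem_v=None):
    """Scaled dot-product attention in which queries can also attend to
    key-value pairs cached from earlier parts of a long input.

    q:            (batch, heads, q_len, head_dim)   current queries
    k, v:         (batch, heads, ctx_len, head_dim) local context window
    mem_k, mem_v: (batch, heads, mem_len, head_dim) cached pairs (assumed shapes)
    """
    if mem_k is not None:
        # Prepend the memory cache so attention spans both the cached
        # long-range tokens and the local context window.
        k = torch.cat([mem_k, k], dim=2)
        v = torch.cat([mem_v, v], dim=2)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    weights = F.softmax(scores, dim=-1)
    return weights @ v
```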

According to the researchers, the model retains its performance on tasks that don't require long contexts and can be used as a drop-in replacement for shorter-context LLaMA implementations. The team has released the smaller 3B variant under the Apache 2.0 license on Hugging Face, along with inference code that supports the longer contexts. More information and examples for LongLLaMA can be found in the project's GitHub repository.
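Loading the released 3B checkpoint with the Hugging Face transformers library might look like the following sketch. The model ID and generation settings are assumptions for illustration; consult the GitHub repository for the official instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name; verify against the project's Hugging Face page.
MODEL_ID = "syzymon/long_llama_3b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # pulls in the custom LongLLaMA modeling code
)

prompt = "Focused Transformer extends context length by"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```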


Source: GitHub