
The AI company Magic AI has developed a language model specialized in code with a context window of 100 million tokens.


Magic AI has developed a new language model called LTM-2-mini that can work with a context window of 100 million tokens, equivalent to about 10 million lines of code or 750 novels. This far exceeds previous limits and could fundamentally change how AI models operate.

Until now, most models have relied primarily on what they learn during training and work with relatively short contexts during inference. Google's Gemini model series is one exception, having demonstrated interesting use cases with context windows of up to 2 million and, in tests, even 10 million tokens.

Magic AI is focusing this technology on software development. A model with access to a project's entire code, documentation, and libraries could greatly improve code generation, according to the company.


HashHop replaces "Needle in a Haystack"

To evaluate models with long context windows, Magic AI developed a new benchmark called HashHop, designed to avoid the weaknesses of previous methods like "Needle in a Haystack." HashHop uses hashes, which are random and incompressible. The model is given hash pairs in its context and must then complete the value of a randomly selected pair, requiring it to store and retrieve the maximum information content for a given context size.
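
For illustration, here is a minimal Python sketch of how a HashHop-style prompt could be constructed from Magic AI's description. The function names, pair format, and prompt layout are assumptions made for this example, not the benchmark's actual code.

```python
import secrets

def random_hash(n_bytes: int = 16) -> str:
    """Return a random, effectively incompressible hex string."""
    return secrets.token_hex(n_bytes)

def build_hashhop_prompt(num_pairs: int = 1000) -> tuple[str, str, str]:
    """Create random key -> value hash pairs, pick one key at random,
    and return (context, query_key, expected_value)."""
    pairs = {random_hash(): random_hash() for _ in range(num_pairs)}
    context = "\n".join(f"{k} -> {v}" for k, v in pairs.items())
    query_key = secrets.choice(list(pairs))
    return context, query_key, pairs[query_key]

context, key, expected = build_hashhop_prompt()
# The model sees all pairs plus the query key and must reproduce `expected`.
prompt = f"{context}\n{key} -> "
```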

A more advanced version of HashHop asks the model to skip steps, for example jumping directly from hash 1 to hash 6. This tests the architecture's ability to skip across and attend to multiple points of the entire context in latent space in a single step. According to Magic AI, HashHop eliminates the implicit and explicit semantic cues that previously allowed traditional recurrent neural networks (RNNs) and the recently popularized state space models (SSMs) to perform well.
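
The multi-hop variant could be sketched along the same lines, again with a hypothetical prompt format: the hashes form a chain, the adjacent pairs are shuffled so no ordering cues remain, and the model is asked to jump straight from the first hash to the last.

```python
import secrets

def random_hash(n_bytes: int = 16) -> str:
    return secrets.token_hex(n_bytes)

# Build a chain hash_1 -> hash_2 -> ... -> hash_6; only adjacent pairs
# appear in the context, in shuffled order, to avoid positional hints.
chain = [random_hash() for _ in range(6)]
pairs = list(zip(chain, chain[1:]))
secrets.SystemRandom().shuffle(pairs)
context = "\n".join(f"{a} -> {b}" for a, b in pairs)

# Skip-step query: given hash 1, the model should answer hash 6 directly,
# without spelling out the intermediate hops.
query, expected = chain[0], chain[-1]
prompt = f"{context}\n{query} -> ... -> "
```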

LTM-2-mini is 1000 times more efficient than Llama 3.1 405B

The LTM-2-mini algorithm for processing a 100 million token context is about 1000 times more efficient than the attention mechanism of Llama 3.1 405B, with significantly lower memory requirements. Magic AI is already working on a larger LTM-2 model and building new supercomputers in collaboration with Google Cloud and Nvidia. The system with Nvidia's Blackwell GB200 NVL72 chips should significantly improve training and inference efficiency, according to Magic CEO Eric Steinberger.

Magic AI recently raised $320 million from investors including Eric Schmidt, Jane Street, and Sequoia, bringing the company's total funding to $465 million.

Summary
  • Magic AI has developed a new language model called LTM-2-mini that can work with a context window of 100 million tokens. This is equivalent to about 10 million lines of code and significantly exceeds previous limits.
  • The company has introduced a new benchmark called HashHop, which is designed to better evaluate the capabilities of models with large context windows than previous methods such as Needle in a Haystack.
  • According to Magic AI, LTM-2-mini's algorithm for processing a context of 100 million tokens is about 1000 times more efficient than Llama 3.1 405B's attention mechanism. The company is already working on a larger LTM-2 model and recently raised $320 million from investors.