
Israeli AI company AI21 Labs introduces Jamba, a hybrid language model that combines the Transformer architecture with structured state space modeling (SSM).

According to the company's announcement, Jamba is the first production-ready model based on a combination of the Transformer architecture and the Mamba SSM architecture. This hybrid approach is designed to surpass existing models in terms of efficiency and context window while maintaining high output quality.


Architectural innovation for more LLM efficiency

Researchers at Carnegie Mellon University and Princeton University developed the Mamba architecture. It specifically targets the weaknesses of pure Transformer models, whose memory requirements grow and whose processing speed drops significantly as the context gets longer.

However, pure SSM models do not achieve the same output quality as the best Transformer models, especially on tasks that depend on recalling earlier context. To address this, AI21 Labs combines both approaches in the Jamba architecture and adds Mixture-of-Experts (MoE) layers.
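The announcement does not include code, but the layer pattern can be pictured with a short, purely illustrative sketch: a stack in which most blocks use a Mamba-style sequence mixer, a few use attention, and some blocks swap the dense MLP for a routed MoE layer. Everything here, from the stand-in SSM mixer to the expert counts and layer ratios, is an assumption for illustration, not AI21 Labs' published implementation.

```python
# Purely illustrative sketch of a Jamba-style hybrid stack (not AI21's code).
import torch
import torch.nn as nn


class SSMMixerStub(nn.Module):
    """Stand-in for a Mamba selective-SSM mixer.

    The real Mamba layer scans the sequence with input-dependent state-space
    parameters in linear time; this stub (gated depthwise convolution) only
    mimics the interface so the overall block layout is runnable.
    """

    def __init__(self, d_model: int, d_conv: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.conv = nn.Conv1d(d_model, d_model, d_conv,
                              padding=d_conv - 1, groups=d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        h, gate = self.in_proj(x).chunk(2, dim=-1)
        h = self.conv(h.transpose(1, 2))[..., : x.size(1)].transpose(1, 2)
        return self.out_proj(torch.nn.functional.silu(gate) * h)


class MoEFeedForward(nn.Module):
    """Routed MLP: each token is weighted toward its top-k experts.

    For clarity this toy version evaluates every expert densely and masks the
    results; a production MoE only computes the selected experts per token.
    """

    def __init__(self, d_model: int, d_ff: int, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):
        scores = self.router(x).softmax(dim=-1)          # (batch, seq, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # per-token expert choice
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e).unsqueeze(-1).float()
                out = out + mask * weights[..., k:k + 1] * expert(x)
        return out


class HybridBlock(nn.Module):
    """One block: attention OR SSM mixer, then a dense or MoE feed-forward."""

    def __init__(self, d_model, use_attention, use_moe, n_heads=8, d_ff=1024):
        super().__init__()
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.use_attention = use_attention
        self.mixer = (nn.MultiheadAttention(d_model, n_heads, batch_first=True)
                      if use_attention else SSMMixerStub(d_model))
        self.mlp = (MoEFeedForward(d_model, d_ff) if use_moe else
                    nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                                  nn.Linear(d_ff, d_model)))

    def forward(self, x):
        h = self.norm1(x)
        if self.use_attention:  # no causal mask here; the sketch is structural only
            h, _ = self.mixer(h, h, h, need_weights=False)
        else:
            h = self.mixer(h)
        x = x + h
        return x + self.mlp(self.norm2(x))


# Toy stack: mostly SSM blocks, occasional attention, MoE in every other block.
# The ratios and sizes are made up for illustration, not AI21's published config.
d_model = 256
stack = nn.Sequential(*[HybridBlock(d_model, use_attention=(i % 4 == 3),
                                    use_moe=(i % 2 == 1)) for i in range(8)])
print(stack(torch.randn(1, 32, d_model)).shape)  # torch.Size([1, 32, 256])
```

The key design idea this sketch tries to convey is that the expensive attention mixer appears only in a fraction of the blocks, while the linear-time SSM mixer handles the rest, which is where the long-context savings come from.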


According to AI21 Labs, Jamba is the first hybrid SSM Transformer scaled to production size. It offers a context window of 256,000 tokens and, in initial tests, achieves a processing speed three times faster for long contexts than the Mixtral 8x7B Transformer. Jamba processes approximately 1,600 tokens per second, while Mixtral manages around 550.


Focus on efficiency with high output quality

Thanks to its MoE layers, Jamba activates only 12 billion of its 52 billion parameters during inference, making it more efficient than a Transformer-only model of comparable size. According to AI21 Labs, the additional parameters improve the model's performance without a proportional increase in computational cost. Jamba is currently the only model in its size class that can process contexts of up to 140,000 tokens on a single 80GB high-end GPU.
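To see how such a total-versus-active split can come about, a back-of-the-envelope calculation helps: if only the feed-forward layers are replicated into experts and just a couple of experts fire per token, total parameters grow much faster than active ones. The expert count and per-expert sizes below are chosen purely so the arithmetic lands near the reported figures; AI21 Labs has not published this exact breakdown in the announcement.

```python
# Back-of-the-envelope MoE parameter count. All figures below are illustrative
# assumptions chosen to land near the reported 52B total / 12B active split.

def moe_param_counts(shared_params_b: float, expert_params_b: float,
                     n_experts: int, active_experts: int) -> tuple[float, float]:
    """Total vs. active parameters (in billions) when MLPs become expert banks."""
    total = shared_params_b + expert_params_b * n_experts
    active = shared_params_b + expert_params_b * active_experts
    return total, active


total_b, active_b = moe_param_counts(shared_params_b=6.3, expert_params_b=2.85,
                                     n_experts=16, active_experts=2)
print(f"total ≈ {total_b:.0f}B, active ≈ {active_b:.0f}B")  # total ≈ 52B, active ≈ 12B
```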


In benchmarks, Jamba performs on par with Mixtral 8x7B while offering the aforementioned advantages in speed and efficiency.


AI21 Labs makes the weights of the Jamba model available under the Apache 2.0 open-source license and invites developers to experiment with the model and build on it. An Instruct version of Jamba will also be available soon as a beta through the AI21 Labs platform.
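For developers who want to try the open weights, a minimal sketch with Hugging Face transformers might look like the following. The repository name ai21labs/Jamba-v0.1 and the loading options are assumptions; the official model card should be checked for the exact identifier, hardware requirements, and recommended settings.

```python
# Minimal sketch for trying the released weights with Hugging Face transformers.
# Repository name and loading options are assumptions; check the official model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision helps fit long contexts on one GPU
    device_map="auto",
    trust_remote_code=True,       # may be required if the architecture is not yet in transformers
)

inputs = tokenizer("The Mamba architecture differs from a Transformer in that",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```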

Additionally, Jamba is offered through Nvidia's API catalog, allowing developers of enterprise applications to deploy Jamba via the Nvidia NIM Inference Microservice.

Summary
  • Israeli AI company AI21 Labs presents Jamba, a new AI language model based on a hybrid architecture of Transformer and structured state space modeling (SSM) that is said to outperform existing models in terms of efficiency and context window.
  • In benchmarks, it performs on par with the open-source favorite Mixtral 8x7B, but is more efficient.
  • AI21 Labs provides the weights of the Jamba model under the Apache 2.0 open-source license and announces an Instruct version for its own platform as well as availability via Nvidia's API catalog.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.