BitNet b1.58 2B4T is a new language model from Microsoft designed to operate with minimal energy and memory usage.

Unlike conventional language models that rely on 16- or 32-bit floating point numbers, BitNet uses just 1.58 bits per weight. This reduction significantly lowers memory requirements, cuts energy consumption, and improves response times—particularly on devices with limited computational resources. The model builds on earlier work from the BitNet team.
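
The unusual figure of 1.58 bits follows from the weights being restricted to three values (-1, 0, and +1): distinguishing between three states takes log2(3) ≈ 1.58 bits of information. A quick sanity check:

```python
import math

# Each ternary weight takes one of three values: -1, 0, or +1,
# so the information content per weight is log2(3).
bits_per_weight = math.log2(3)
print(f"{bits_per_weight:.2f} bits per weight")  # 1.58
```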

Modifying the transformer architecture for efficiency

Although BitNet is based on the standard transformer architecture, it incorporates several modifications aimed at greater efficiency. For instance, the developers replaced the standard linear layers with so-called BitLinear layers, which rely on simplified numerical representations. Activations were also quantized to 8-bit values. Despite these reductions, BitNet reportedly performs comparably to models that are two to three times larger.
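
The BitNet papers describe BitLinear as quantizing weights to ternary values via absolute-mean scaling and activations to 8 bits via absolute-max scaling. The following is a minimal NumPy sketch of that idea; the function names are illustrative and not part of Microsoft's released code:

```python
import numpy as np

def quantize_weights_ternary(W, eps=1e-5):
    # Scale by the mean absolute value, then round into {-1, 0, +1}.
    gamma = np.abs(W).mean() + eps
    Wq = np.clip(np.round(W / gamma), -1, 1)
    return Wq, gamma

def quantize_activations_int8(x, eps=1e-5):
    # Scale by the max absolute value into the signed 8-bit range.
    scale = 127.0 / (np.abs(x).max() + eps)
    xq = np.clip(np.round(x * scale), -127, 127)
    return xq, scale

def bitlinear_forward(x, W):
    # Matrix multiply in low precision, then rescale the result.
    Wq, gamma = quantize_weights_ternary(W)
    xq, scale = quantize_activations_int8(x)
    return (xq @ Wq.T) * (gamma / scale)
```

Because the weights are ternary, the multiplications in the matrix product reduce to additions and subtractions, which is where much of the energy saving comes from.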

The model was trained on four trillion tokens drawn from public web content, educational materials, and synthetic math problems. It was subsequently fine-tuned on specialized dialogue datasets and optimized to produce responses that are both helpful and safe.

Assessing BitNet b1.58 2B4T for local deployment

In benchmark tests, BitNet outperformed other compact models and performed competitively with significantly larger and less efficient systems. With a memory footprint of only 0.4 gigabytes, the model is suitable for deployment on laptops or in cloud environments. Compared to models that are quantized after training, such as those using INT4 quantization, BitNet demonstrates a stronger balance of performance and efficiency.
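
The 0.4-gigabyte figure is consistent with simple arithmetic: roughly two billion parameters at 1.58 bits each, compared with the same model stored as 16-bit floats:

```python
params = 2_000_000_000  # ~2B parameters

def size_gb(bits_per_weight):
    # Total size: parameters x bits per weight, converted to gigabytes.
    return params * bits_per_weight / 8 / 1e9

print(f"ternary: {size_gb(1.58):.2f} GB")  # ~0.40 GB
print(f"fp16:    {size_gb(16):.2f} GB")    # 4.00 GB
```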

To facilitate adoption, Microsoft has released dedicated inference tools for both GPU and CPU execution, including a lightweight C++ version. Future development plans include expanding the model to support longer texts, additional languages, and multimodal inputs such as images. Microsoft is also working on another efficient model family under the Phi series.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • With BitNet b1.58 2B4T, Microsoft has developed a new language model that is extremely efficient, using only 1.58 bits per weight and therefore requiring less memory, energy, and processing power.
  • Despite adaptations such as BitLinear layers and 8-bit activations, BitNet achieves comparable performance to much larger models and outperforms other low-cost models in tests.
  • With a memory footprint of just 0.4 gigabytes, BitNet runs on standard hardware such as laptops, supported by Microsoft's dedicated inference tools. In the future, Microsoft plans to expand the model and add support for more languages.
Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.