Content
summary Summary

Answer.AI and LightOn announced ModernBERT, a new open-source language model that improves upon Google's BERT in speed, efficiency, and quality.

Ad

The encoder-only model processes text up to four times faster than its predecessor while using less memory, according to a blog post from the developers. The team trained ModernBERT on 2 trillion tokens from web documents, programming code, and scientific articles.

ModernBERT can handle texts up to 8,192 tokens long—16 times more than the typical 512-token limit of existing encoder models. It's also the first encoder model trained extensively on programming code. The model scored above 80 on the StackOverflow Q&A dataset, setting a record for encoder-only models.

Scatter plot: Pareto efficiency analysis of language models, X-axis shows runtime (ms/token), Y-axis GLUE score (84-91), Pareto front marked in yellow.
When measuring the balance between processing speed and accuracy in the General Language Understanding Evaluation (GLUE), ModernBERT-Large achieves optimal results with about 20ms per token and 90 points. | Image: Answer.AI, LightOn

The developers liken ModernBERT to a Honda Civic tuned for the racetrack: "When you get on the highway, you generally don’t go and trade in your car for a race car, but rather hope that your everyday reliable ride can comfortably hit the speed limit."

Ad
Ad

Major cost reductions for large-scale text processing

While large language models such as GPT-4 charge several cents per query and take seconds to respond, ModernBERT runs locally and is much faster and cheaper, according to the developers.

For example, filtering 15 trillion tokens in the FineWeb Edu project cost $60,000 using a BERT-based model. The same task would have cost over $1 million even with Google Gemini Flash, the cheapest decoder-based option.

The developers say ModernBERT is well-suited for many real-world applications, from retrieval-augmented generation (RAG) systems to code search and content moderation. Unlike GPT-4, which needs specialized hardware, the model runs effectively on consumer-grade gaming GPUs.

ModernBERT is available in two versions: a base model with 139 million parameters and a large version with 395 million parameters. Both models are now on Hugging Face with an Apache 2.0 license, and users can drop them in as direct replacements for their current BERT models. The team plans to release a larger version next year but has no plans for multimodal capabilities.

To promote development of new applications, the developers launched a competition that will award $100 and a six-month Hugging Face Pro subscription to each of the five best demos.

Recommendation

Google introduced BERT (Bidirectional Encoder Representations from Transformers) in 2018, using it primarily for Google Search. The model remains one of the most popular on HuggingFace, with more than 68 million monthly downloads.

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Hugging Face, Lighton, and Answer.AI have jointly developed ModernBERT, an open-source successor to Google's BERT model that promises better performance and efficiency while requiring fewer resources.
  • ModernBERT brings significant performance gains over its predecessor, processing text up to four times faster than the original BERT model. The new system can handle longer inputs of up to 8,192 tokens while using less memory, making it more practical for everyday applications.
  • The model shows particular strength in handling programming code, becoming the first encoder-only model to score above 80 on the StackOverflow QA dataset.
Sources
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.