Baidu's Ernie 5.1 cuts 94 percent of pre-training costs while competing with top models
Key Points
- Baidu has launched Ernie 5.1, a language model derived from its larger predecessor Ernie 5.0, making it more resource-efficient while currently ranking first among Chinese models on the Arena Search Leaderboard.
- The model uses a four-stage training pipeline with specialized expert models for code, logic, and agent tasks, designed to prevent different capabilities from interfering with each other during the learning process.
- Ernie 5.1 is accessible through Baidu's platforms and integrated into various creative applications, but the model weights remain closed, making independent verification of its reported performance impossible.
Baidu has released Ernie 5.1, a language model built on the pre-training foundation of its predecessor Ernie 5.0 but with roughly a third of the total parameters and about half the active parameters per query.
Pre-training costs came in at just six percent of what comparable models require, according to Baidu. On the Arena Search Leaderboard, Ernie 5.1 scored 1,223 points as of May 9—4th place globally and 1st among Chinese models.

In additional benchmarks, Baidu claims Ernie 5.1 beats DeepSeek-V4-Pro on autonomous AI agent tasks (tau3-bench, SpreadsheetBench-Verified) and comes close to Google's Gemini 3.1 Pro on knowledge and reasoning benchmarks (GPQA, MMLU-Pro). On a tough math benchmark (AIME26), the model with tool access lands just behind Gemini 3.1 Pro. Internal evaluations also show the model matching Western commercial models in creative writing, Baidu says.

Ernie 5.1 is a smaller model based on its predecessor
Baidu built Ernie 5.1 as a smaller sub-model from Ernie 5.0 using an approach the company calls the "Once-For-All elastic training framework." Instead of running a separate, expensive pre-training pass for each model size, the company optimizes an entire family of differently sized models in a single run.

The models share weights but differ in depth, width, and how many specialized expert blocks activate for a given query. Baidu picked what it considers the best configuration from this family for Ernie 5.1, which explains the low pre-training costs, since the heavy compute was already done for Ernie 5.0.
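The weight-sharing idea can be illustrated with a toy model. The following is a minimal sketch, not Baidu's framework: the class `ElasticSupernet`, the config fields, and the "first k experts" routing are all hypothetical stand-ins chosen for illustration. It only shows how differently sized sub-models can be slices of one shared set of weights that differ in depth, width, and number of active expert blocks.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SubModelConfig:
    depth: int   # number of layers used
    width: int   # hidden units used per layer
    top_k: int   # expert blocks activated per query


class ElasticSupernet:
    """Toy weight-sharing model: every sub-model is a slice of the same weights."""

    def __init__(self, max_depth, max_width, num_experts, seed=0):
        rng = random.Random(seed)
        self.max_depth = max_depth
        self.max_width = max_width
        self.num_experts = num_experts
        # weights[layer][expert] is a max_width x max_width matrix.
        self.weights = [
            [[[rng.uniform(-0.5, 0.5) for _ in range(max_width)]
              for _ in range(max_width)]
             for _ in range(num_experts)]
            for _ in range(max_depth)
        ]

    def sample_config(self, rng):
        """Pick a random sub-model size, as one might do at each training step."""
        return SubModelConfig(
            depth=rng.randint(1, self.max_depth),
            width=rng.randint(1, self.max_width),
            top_k=rng.randint(1, self.num_experts),
        )

    def active_params(self, cfg):
        """Number of weights a sub-model touches for one input."""
        return cfg.depth * cfg.top_k * cfg.width * cfg.width

    def forward(self, x, cfg):
        h = list(x[:cfg.width])
        for layer in self.weights[:cfg.depth]:
            out = [0.0] * cfg.width
            for expert in layer[:cfg.top_k]:  # toy routing: always the first k experts
                for i in range(cfg.width):
                    out[i] += sum(expert[i][j] * h[j] for j in range(cfg.width))
            h = [max(0.0, v / cfg.top_k) for v in out]  # average experts, then ReLU
        return h


net = ElasticSupernet(max_depth=4, max_width=8, num_experts=4)
full = SubModelConfig(depth=4, width=8, top_k=4)
small = SubModelConfig(depth=2, width=4, top_k=2)
x = [1.0] * 8
print(net.active_params(full), net.active_params(small))
print(len(net.forward(x, full)), len(net.forward(x, small)))
```

In a real elastic training run, a config would be sampled per step and gradients would flow into the shared weights, so picking a smaller configuration afterwards needs no new pre-training pass.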
In addition, Baidu rebuilt its reinforcement learning infrastructure from the ground up. The key components—model updates, response generation, and evaluation—traditionally run tightly coupled. Baidu now runs them as separate subsystems that scale independently, coordinated by a central controller. Each component gets the right hardware, and a bottleneck in one step doesn't block the others, the company says.
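The decoupling described above can be sketched with three workers connected by queues. This is an illustrative toy under stated assumptions, not Baidu's infrastructure: the function names are hypothetical, threads stand in for separately scaled hardware pools, and the "response" and "reward" are placeholders.

```python
import queue
import threading


def rollout_worker(prompts, out_q):
    """Response generation: produces one (prompt, response) pair per prompt."""
    for prompt in prompts:
        out_q.put((prompt, prompt[::-1]))  # placeholder "response"
    out_q.put(None)  # sentinel: no more work


def reward_worker(in_q, out_q):
    """Evaluation: attaches a score to every generated response."""
    while (item := in_q.get()) is not None:
        prompt, response = item
        out_q.put((prompt, response, float(len(response))))  # placeholder reward
    out_q.put(None)


def trainer(in_q, batch_size, updates):
    """Model update: consumes scored samples in batches."""
    batch = []
    while (item := in_q.get()) is not None:
        batch.append(item)
        if len(batch) == batch_size:
            updates.append(sum(r for _, _, r in batch) / len(batch))
            batch = []
    if batch:  # flush the final partial batch
        updates.append(sum(r for _, _, r in batch) / len(batch))


def run_pipeline(prompts, batch_size=2):
    """Central controller: wires the subsystems together and waits for them."""
    generated = queue.Queue(maxsize=4)
    scored = queue.Queue(maxsize=4)
    updates = []
    threads = [
        threading.Thread(target=rollout_worker, args=(prompts, generated)),
        threading.Thread(target=reward_worker, args=(generated, scored)),
        threading.Thread(target=trainer, args=(scored, batch_size, updates)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return updates


print(run_pipeline(["ab", "cde", "f"]))
```

Because each stage only talks to its neighbors through a bounded queue, a slow evaluator delays downstream work without forcing generation and training to run in lockstep, which is the property the article attributes to the redesign.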
A persistent challenge in large-model reinforcement learning is drift between training and response generation: the two steps often run with different computation settings, so they assign slightly different probabilities to the same output. This can destabilize the whole process. Baidu addresses it with a standardized low-precision computation library, plus a correction mechanism for mixture-of-experts models that cuts drift in half without noticeably slowing things down.
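Baidu does not describe its correction mechanism in detail. As a stand-in, the sketch below shows truncated importance sampling, a generic way to compensate for this kind of mismatch; it is not confirmed to be what Baidu uses. The function names and the `cap` value are assumptions made for illustration.

```python
import math


def truncated_importance_weights(trainer_logprobs, sampler_logprobs, cap=2.0):
    """Per-token weight = min(p_trainer / p_sampler, cap).

    The ratio re-weights samples drawn by the generation engine so they
    look as if drawn by the trainer; the cap limits the variance.
    """
    return [
        min(math.exp(t - s), cap)
        for t, s in zip(trainer_logprobs, sampler_logprobs)
    ]


def mean_abs_logprob_gap(trainer_logprobs, sampler_logprobs):
    """Simple drift measure: average absolute log-probability difference."""
    gaps = [abs(t - s) for t, s in zip(trainer_logprobs, sampler_logprobs)]
    return sum(gaps) / len(gaps)


# Hypothetical log-probabilities for three tokens of one sampled response.
trainer_lp = [-1.0, -2.0, -0.5]
sampler_lp = [-1.0, -2.5, -3.0]

weights = truncated_importance_weights(trainer_lp, sampler_lp)
drift = mean_abs_logprob_gap(trainer_lp, sampler_lp)
print(weights, drift)
```

Tokens on which both engines agree keep a weight of 1, while tokens the sampler considered far less likely than the trainer are capped rather than allowed to dominate the update.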

A four-stage pipeline tackles the "seesaw effect"
Baidu uses a four-stage fine-tuning process to address a well-known problem: training multiple skills at once often means gains in one area come at the cost of another. Baidu calls this the "seesaw effect": coding ability, logic, and creativity end up dragging each other down.
The pipeline starts with standard supervised training on a broad dataset. Stage two trains several specialized expert models in parallel, one each for code, reasoning, and agent tasks, each with its own evaluation signals.

In stage three, a single student model learns from all these teachers simultaneously by generating its own answers and comparing them against the experts' outputs. The final stage adds general reinforcement learning for open-ended dialog and creative tasks. Baidu says this step is necessary because teacher-student distillation tends to produce answers that are too polished and lack variety.
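Stage three can be sketched as a loss that compares the student's own output distribution with the matching expert's distribution, one domain at a time. The reverse KL divergence used here is a common choice for this kind of on-policy distillation, but the article does not say which objective Baidu uses, so treat it as an assumption. The toy distributions and function names are hypothetical.

```python
import math


def reverse_kl(student, teacher):
    """KL(student || teacher): the expectation is taken over the student's own outputs."""
    return sum(
        p * math.log(p / teacher[token])
        for token, p in student.items()
        if p > 0.0
    )


def multi_teacher_loss(student_by_domain, teacher_by_domain):
    """Average the per-domain divergence between the student and each expert."""
    per_domain = {
        domain: reverse_kl(student_by_domain[domain], teacher_by_domain[domain])
        for domain in teacher_by_domain
    }
    return sum(per_domain.values()) / len(per_domain), per_domain


# Toy next-token distributions over a three-token vocabulary.
teachers = {
    "code": {"a": 0.7, "b": 0.2, "c": 0.1},
    "reasoning": {"a": 0.1, "b": 0.8, "c": 0.1},
    "agent": {"a": 0.2, "b": 0.2, "c": 0.6},
}
student = {
    "code": {"a": 0.7, "b": 0.2, "c": 0.1},       # already matches its teacher
    "reasoning": {"a": 0.3, "b": 0.4, "c": 0.3},  # still far from its teacher
    "agent": {"a": 0.3, "b": 0.3, "c": 0.4},
}

loss, per_domain = multi_teacher_loss(student, teachers)
print(loss, per_domain)
```

Each domain contributes its own term, so the student is pulled toward the code expert only on code prompts and toward the reasoning expert only on reasoning prompts, which is how a setup like this aims to avoid one skill overwriting another.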

Available on creative platforms, but no open weights
Ernie 5.1 is available through ernie.baidu.com and a playground in Baidu AI Studio. The model will also roll out to more than ten creative platforms, including the role-playing platform Isekai Zero, creative agent Mulan AI, AI canvas app Diting Huanliu, and short drama generator Storymaster.
As with Ernie 5.0, Baidu hasn't released model weights, so the benchmark scores and efficiency claims can't be independently verified.
Baidu laid the groundwork for this leaner release with Ernie 5.0 in January 2026. That model processes text, images, audio, and video in a unified architecture using a mixture-of-experts structure with roughly 2.4 trillion total parameters, fewer than three percent of which activate per query.
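The figures in this article allow a rough back-of-the-envelope estimate of Ernie 5.1's size. The calculation below is a sketch derived only from the rounded numbers quoted here ("roughly 2.4 trillion," "fewer than three percent," "roughly a third," "about half"); Baidu has not published exact parameter counts, so the results are approximations and upper bounds, not official specifications.

```python
# Figures quoted in the article (all rounded by the source).
ERNIE_5_0_TOTAL_PARAMS = 2_400_000_000_000  # "roughly 2.4 trillion"

# "Fewer than three percent" of parameters activate per query: an upper bound.
ernie_5_0_active_upper = ERNIE_5_0_TOTAL_PARAMS * 3 // 100

# Ernie 5.1: "roughly a third" of the total, "about half" of the active parameters.
ernie_5_1_total_estimate = ERNIE_5_0_TOTAL_PARAMS // 3
ernie_5_1_active_upper = ernie_5_0_active_upper // 2

# A 94 percent cut leaves this share of the comparable pre-training cost.
pretraining_cost_share_percent = 100 - 94

print(f"Ernie 5.0 active parameters: under {ernie_5_0_active_upper / 1e9:.0f} billion")
print(f"Ernie 5.1 total parameters:  about {ernie_5_1_total_estimate / 1e9:.0f} billion")
print(f"Ernie 5.1 active parameters: under {ernie_5_1_active_upper / 1e9:.0f} billion")
print(f"Pre-training cost share:     {pretraining_cost_share_percent} percent")
```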