
Jonathan Kemper

Jonathan writes for THE DECODER about how AI tools can improve both work and creative projects.
Ernie 5.0: Baidu's 2.4 trillion parameter model becomes China's best in LMArena

Baidu's new AI model Ernie 5.0, which processes text, images, audio, and video in a unified architecture, is now officially available. According to the LMArena ranking from January 15, 2026, Ernie-5.0-0110 scored 1,460 points, placing 8th globally and 1st among all Chinese models. That puts it on par with OpenAI's slightly older GPT-5.1 (High) and ahead of both Google's Gemini 2.5 Pro and Anthropic's Claude Sonnet 4.5. The next best Chinese model is GLM-4.7 from Zhipu AI. In the math category, Ernie 5.0 came in second worldwide, trailing only GPT-5.2 (High).

LMArena ranking: Baidu's Ernie-5.0-0110 places 8th with 1,460 points in the text benchmarks of the top 10.
The LMArena ranking is determined from numerous anonymous pairwise comparisons in which users choose the better model answer.
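
To illustrate how pairwise votes can turn into a point score, here is a minimal sketch of a classic Elo-style update. This is an assumption for illustration only and not LMArena's actual scoring method; the model names, starting ratings, votes, and the `k` factor are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one pairwise vote (k is a hypothetical step size)."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the total number of points stays constant.
    return rating_a + delta, rating_b - delta


# Hypothetical models and votes: (model A, model B, did the user prefer A?)
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [
    ("model_a", "model_b", True),
    ("model_a", "model_b", True),
    ("model_a", "model_b", False),
    ("model_a", "model_b", True),
]

for a, b, a_won in votes:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], a_won)

print(ratings)
```

In this toy run, the model preferred in three of four votes ends up with the higher rating. A real leaderboard aggregates far more votes across many model pairs.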

Under the hood, the model uses a mixture-of-experts architecture with around 2.4 trillion parameters, but fewer than 3 percent of those are active for any given query. For now, the model is only available at ernie.baidu.com. Unlike previous releases, Baidu hasn't published any weights yet, and there's no technical report or detailed documentation available. The company's most recent open release was Ernie-4.5-VL-28B-A3B-Thinking, a model that can manipulate images during its reasoning process, for example by zooming in on text to read it more clearly.
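
To put the mixture-of-experts figures for Ernie 5.0 in perspective, the two numbers reported above imply a rough ceiling on how many parameters are active per query. The short calculation below uses only those two figures; the exact active count has not been published, so the result is an upper bound and not a specification.

```python
# Figures as reported in the article (both approximate).
TOTAL_PARAMS = 2.4e12            # around 2.4 trillion parameters
ACTIVE_FRACTION_CEILING = 0.03   # "fewer than 3 percent" active per query

# Upper bound on the number of parameters used for a single query.
max_active = TOTAL_PARAMS * ACTIVE_FRACTION_CEILING

print(f"Upper bound on active parameters: {max_active / 1e9:.0f} billion")
```

So each query touches fewer than roughly 72 billion parameters, even though the full model is more than thirty times that size.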

Ollama brings local AI image generation to Mac

Ollama, the popular software for running AI models locally, now supports image generation on macOS. The feature is still experimental, with Windows and Linux support coming later. Two models are available at launch. Z-Image Turbo from Alibaba's Tongyi Lab is a 6-billion-parameter model that creates photorealistic images and can render text in both English and Chinese. The recently released Flux 2 Klein from Black Forest Labs is the German company's fastest image model yet and comes in 4B and 9B variants.

Terminal window showing an Ollama prompt for a cat holding a "Hello" sign and the generated AI image in the interface.
Terminals such as Ghostty or iTerm2 display previews directly.

The 4B version of Flux 2 Klein runs on standard graphics cards with at least 13 GB VRAM, such as an Nvidia RTX 3090 or 4070. The smaller version is available for commercial use, while the larger version is restricted to non-commercial applications. Generated images save directly to the current directory, and users can tweak image size, step count, and seed values. Additional models and image editing features are planned.

Google's new open TranslateGemma models bring translation for 55 languages to laptops and phones

TranslateGemma shows how targeted training helps Google squeeze more performance out of smaller models: the 12B version translates better than a base model twice its size and runs on a regular laptop. With the growing Gemma family, Google is staking its claim in the race for open AI models.

Deutsche Telekom puts Elevenlabs AI on the phone to handle customer calls

Deutsche Telekom will soon use AI voice agents from Elevenlabs in its customer service. Customers will be able to talk to realistic-sounding AI voices around the clock through the app or by phone. The partnership between Europe's largest telecom company and the AI audio startup goes back a while. Since October 2025, Magenta customers have been able to convert text into podcasts up to 25 times a month for free in the MeinMagenta app. Deutsche Telekom also invested in Elevenlabs' Series C funding round.

According to Elevenlabs' internal data, the AI support agent successfully resolves about 80 percent of user queries, particularly when it comes to specific documentation questions. For more complex issues like troubleshooting or pricing inquiries, though, the system still hits its limits and needs to hand off to human support.

Elevenlabs recently launched a marketplace for licensed voices of famous people like John Wayne and Judy Garland. Last year, the company introduced the Eleven v3 voice model with expanded expression options.

Google's MedGemma 1.5 brings 3D CT and MRI analysis to open-source medical AI

Google has updated its open-source medical AI with MedGemma 1.5, a model capable of analyzing 3D medical scans like CTs and MRIs. The release also includes a specialized speech tool that reportedly outperforms OpenAI’s Whisper in medical dictation tasks, though strict licensing conditions apply to clinical use of both models.