Zhipu AI's GLM-4.5 is yet another open-source Chinese LLM closing the gap with Western models

Aug 17, 2025

Sora prompted by THE DECODER

Zhipu AI has released GLM-4.5 and GLM-4.5V, a new family of open-source language models built for logical reasoning, programming, and agent-based tasks.

To showcase the models' practical capabilities, Zhipu AI points to examples like generating interactive mini-games and physics simulations, producing presentation slides with autonomous web search, and developing full web applications with integrated frontends and backends.

Browser screenshot of a Flappy Bird clone at chat.z.ai with a score of 6 and a best score of 2. — This playable demo was generated from a single prompt: "Write a Flappy Bird game for me in a single HTML page. Keep the gravity weak so that the game is not too hard." | Image: Screenshot by THE DECODER

The multimodal version, GLM-4.5V, adds image and video analysis, can reconstruct websites from screenshots, and perform screen operations for autonomous agents. Users can try these features for free in a ChatGPT-style interface at chat.z.ai after logging in.

Chat window: User asks about video optimization; GLM-4.5V provides tips on 4K export, noise reduction, rule of thirds, dolly shots, and angle selection. — chat.z.ai can analyze text, images, and even videos. | Image: Screenshot by THE DECODER

The lineup includes three models: the standard GLM-4.5, the lighter GLM-4.5-Air, and the multimodal GLM-4.5V. Each model offers a hybrid approach with two modes: "think mode" for complex reasoning and "quick response mode" for faster answers.

Strong results with fewer parameters

Zhipu AI claims GLM-4.5V delivers the strongest performance among open-source models of a similar size. In tests across twelve benchmarks, GLM-4.5 ranked third overall and second for autonomous tasks. It scored 70.1 percent on TAU-Bench agent tasks, 91.0 percent on AIME 24 math problems, and 64.2 percent on SWE-Bench Verified software engineering tasks.

Bar chart: Performance of 13 LLMs on 12 benchmarks (agentic, reasoning, coding); GLM-4.5 ranked 3rd (63.2), GLM-4.5-Air ranked 6th (59.8) — GLM-4.5 placed third out of 13 large language models across 12 agent, reasoning, and coding benchmarks with 63.2 points; the resource-efficient GLM-4.5-Air took sixth place with 59.8 points. | Image: Zhipu AI

Parameter efficiency stands out: GLM-4.5 uses just half as many parameters as Deepseek-R1 and a third of what Kimi K2 requires, yet matches or beats their performance. For web navigation, GLM-4.5 reaches 26.4 percent on BrowseComp, surpassing even the much larger Claude Opus 4, which scores 18.8 percent.

Scatter plot of model parameters (B) vs. SWE bench scores with GLM-4.5 and GLM-4.5-Air on the Pareto frontier. — Even the smaller Air model matches Deepseek R1 for coding tasks, despite using far fewer parameters. | Image: Zhipu AI

Deeper architecture for better reasoning

GLM-4.5 uses a mixture-of-experts architecture with a total of 355 billion parameters and 32 billion active at any time. The compact GLM-4.5-Air has 106 billion parameters, with 12 billion active. GLM-4.5V builds on the Air version.

Unlike models such as Deepseek-V3 and Kimi K2, Zhipu AI favors deeper networks with more layers rather than wider ones with more parameters per layer. Their research found that increasing depth boosted reasoning abilities. Training covered around 23 trillion tokens in multiple phases, starting with general data and progressing to specialized code and reasoning tasks.

Zhipu's rise to a billion-dollar valuation

All models are available through the Z.ai platform with OpenAI-compatible API endpoints. The code is open source on Github, and model weights can be downloaded from Hugging Face and Alibaba's Modelscope.

Zhipu AI first attracted attention in 2022, when its GLM-130B model outperformed offerings from Google and OpenAI. Founded in 2019 by professors from Tsinghua University and based in Beijing, the company now employs more than 800 people, most of whom work in research and development.

Major investors include Chinese tech giants like Alibaba, Tencent, and Xiaomi and several sovereign wealth funds. International backers such as Saudi Aramco's Prosperity7 Ventures have also joined in, and the company is now valued at over $5 billion. Like Deepseek, Zhipu AI is known for its strong academic team and independent research and is currently preparing for an IPO.

All Chinese AI models are subject to government censorship, reflecting the priorities and ideology of the Chinese administration. Meanwhile, the US government under Trump is pushing for its own restrictions on US AI models, driven by a different set of political values. In both cases, these models risk becoming tools for state propaganda and the broader culture wars—different ideologies, but ultimately similar forms of censorship that shape how AI systems are used and what they can say.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

AI news without the hype
Curated by humans.

Over 20 percent launch discount.
Read without distractions – no Google ads.
Access to comments and community discussions.
Weekly AI newsletter.
6 times a year: “AI Radar” – deep dives on key AI topics.
Up to 25 % off on KI Pro online events.
Access to our full ten-year archive.
Get the latest AI news from The Decoder.

Subscribe to The Decoder