AI in practice
2025-04-29
Maximilian Schreiner

Qwen3 series from Alibaba debuts with benchmark results matching top competitors

Max is the managing editor of THE DECODER, bringing his background in philosophy to explore questions of consciousness and whether machines truly think or just pretend to.

Alibaba launches Qwen3, an open source family of models designed to compete with leading systems.

Alibaba has released its Qwen3 model series, which achieves benchmark results on par with current top models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro.

The two largest models in the lineup, Qwen3-235B-A22B and Qwen3-30B-A3B (both using a Mixture-of-Experts architecture), match the performance of leading systems in standard tests for coding, mathematics, and general capabilities—often with smaller model sizes. According to benchmark data, these strong results were achieved in reasoning mode, likely using the highest available token budget.
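The model names themselves describe the Mixture-of-Experts setup. As a rough illustration, assuming the naming pattern `Qwen3-<total>B-A<active>B` (total parameters, then parameters activated per token, both in billions), the following Python sketch reads both figures from a name. The function `parse_moe_name` is hypothetical and not part of any Qwen tooling.

```python
import re


def parse_moe_name(name: str) -> dict:
    """Read total and active parameter counts (in billions) from a model name.

    Assumes the pattern 'Qwen3-<total>B-A<active>B'; this helper is
    illustrative only and does not come from Alibaba's code.
    """
    match = re.fullmatch(r"Qwen3-(\d+)B-A(\d+)B", name)
    if match is None:
        raise ValueError(f"not an MoE-style Qwen3 name: {name!r}")
    total_b = int(match.group(1))
    active_b = int(match.group(2))
    return {
        "total_b": total_b,
        "active_b": active_b,
        # Share of the parameters used for any single token.
        "active_fraction": active_b / total_b,
    }


if __name__ == "__main__":
    for model in ("Qwen3-235B-A22B", "Qwen3-30B-A3B"):
        info = parse_moe_name(model)
        print(model, info["total_b"], info["active_b"],
              round(info["active_fraction"], 3))
```

Under this reading, Qwen3-235B-A22B activates about 9 percent of its parameters per token and Qwen3-30B-A3B about 10 percent, which is part of why these models can be cheaper to run than their total size suggests.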

The pretraining process for Qwen3 involved 36 trillion tokens—more than Llama 4 Maverick (22T) but less than Llama 4 Scout (40T). The training data includes web content, documents, and custom-generated mathematics and programming datasets. Qwen3 models are released under the Apache 2.0 license, making them freely available.

Qwen3 is a hybrid open-source model

A key feature of Qwen3 is its ability to switch between two reasoning modes. In "Thinking Mode," the model solves tasks with detailed intermediate steps. In "Non-Thinking Mode," it delivers fast, direct answers. Similar approaches are found in other models, including Claude 3.7 and Grok. More complex tasks can benefit from the reasoning function, while the faster mode is designed for routine queries.

Alibaba states that the models support 119 languages and dialects, covering widely spoken languages like English, Chinese, and Arabic, as well as numerous minority languages and regional dialects. Actual model performance will depend on the specific application context.

Published benchmark results indicate a high-performance model series that, relative to model size, currently outpaces competitors such as Meta’s Llama series and DeepSeek. However, this lead may be short-lived: Meta is hosting its first LlamaCon today and is expected to introduce a reasoning model based on Llama 4, while DeepSeek is anticipated to release the successor to R1 in the coming weeks.

Summary
  • Alibaba has introduced Qwen3, a new family of open language models that achieve benchmark results on par with leading systems such as DeepSeek-R1 or Gemini-2.5-Pro.
  • The models, especially the largest Qwen3-235B-A22B and Qwen3-30B-A3B, show comparable performance to top systems in programming, math and general skills tests, but are often smaller in size.
  • Qwen3 offers a switchable mode for detailed problem-solving or quick answers, supports 119 languages, and is freely available under the Apache 2.0 license.
Sources
Qwen3