Alibaba's QwQ model takes on OpenAI o1 with enhanced reasoning capabilities

Nov 28, 2024


Alibaba has released QwQ-32B-Preview, a new AI model that focuses on logical reasoning and problem-solving capabilities. The model appears to match and sometimes outperform OpenAI's latest offerings in specific areas.

The Chinese tech giant's AI team, Qwen, says its new language model has 32.5 billion parameters and can process up to 32,000 tokens of context. QwQ-32B-Preview shows particularly strong results on the AIME and MATH-500 math benchmarks and on the GPQA scientific reasoning benchmark.
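
To put the parameter count in perspective, the following is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at common numeric precisions. The 32.5 billion figure comes from the Qwen team; the precision table and the helper name `weight_memory_gb` are illustrative assumptions, and the estimate ignores activations, the context cache, and runtime overhead, so real requirements will be higher.

```python
# Weights-only memory estimate for a 32.5-billion-parameter model.
# Assumption: every parameter is stored at the same precision; activations,
# KV cache, and framework overhead are deliberately left out.
PARAMS = 32.5e9  # parameter count reported by the Qwen team

# Bytes needed to store one parameter at each precision (illustrative table).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes, using 1 GB = 1e9 bytes."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: about {weight_memory_gb(PARAMS, size):.0f} GB")
```

At 16-bit precision this works out to roughly 65 GB for the weights, which is why models of this size are typically run on multiple GPUs or in quantized form.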

Figure: Comparison table of six AI language models on four benchmarks (GPQA, AIME, MATH-500, LiveCodeBench), reported as percentages. QwQ matches and sometimes exceeds OpenAI's o1-preview in logic benchmarks. | Image: Qwen

Self-checking capabilities

Like OpenAI's o1 models, QwQ incorporates a self-verification system. It pre-plans its answers and double-checks its work, a process that adds to processing time but also boosts accuracy compared to typical language models. The Qwen team waxes philosophical about this feature:

QwQ embodies that ancient philosophical spirit: it knows that it knows nothing, and that’s precisely what drives its curiosity. Before settling on any answer, it turns inward, questioning its own assumptions, exploring different paths of thought, always seeking deeper truth. Yet, like all seekers of wisdom, QwQ has its limitations. This version is but an early step on a longer journey - a student still learning to walk the path of reasoning. Its thoughts sometimes wander, its answers aren’t always complete, and its wisdom is still growing. But isn’t that the beauty of true learning? To be both capable and humble, knowledgeable yet always questioning?

Qwen research team

The researchers acknowledge some shortcomings. QwQ can sometimes switch languages unexpectedly, get stuck in loops, and stumble over common-sense reasoning—common pitfalls for logic-focused language models.

Released under the Apache 2.0 license, QwQ is available for commercial use. However, Alibaba has only released certain components, making full replication impossible for now. A demo is available on Hugging Face.

Alibaba's cloud computing unit introduced the first Qwen models in August 2023. Qwen2, a more powerful successor, followed soon after, with improvements in programming, math, logic, and multilingual capabilities.

The current Qwen2.5 series includes specialized versions: Qwen2.5 for general language tasks, Qwen2.5-Coder for programming, and Qwen2.5-Math for mathematics. Qwen2.5-Turbo, designed for larger context windows, was added recently.

China's growing AI presence

QwQ is the second "reasoning model" to come out of China. DeepSeek recently unveiled a similar system that also appears to challenge OpenAI's offerings. While both are currently only available as "mini" or preview versions, full releases could come later this year.

The arrival of these two Chinese models just weeks after OpenAI's o1 introduction raises interesting questions about OpenAI's competitive edge. However, the full capabilities of OpenAI's o1 model remain undisclosed, particularly regarding the potential of compute scaling. There might be more to these models than meets the eye, and architectural differences could still give OpenAI a distinct advantage.
