Alibaba's open Qwen 3.5 takes aim at GPT-5 mini and Claude Sonnet 4.5 at a fraction of the cost

Matthias Bastian
Feb 26, 2026

Alibaba has released its new Qwen 3.5 model series. The lineup includes four models: Qwen3.5-Flash, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. According to Alibaba, the models deliver stronger performance while using less computing power. All four take text, images, and video as input and generate text as output.

The smaller Qwen3.5-35B-A3B model outperforms its much larger predecessor, Qwen3-235B-A22B, a clear sign that better architecture, data quality, and reinforcement learning matter more than raw model size. The larger 122B and 27B variants aim to close the remaining gap to top-tier models, particularly in complex agent scenarios.

Benchmarks show Alibaba's Qwen 3.5 models matching or outperforming top Western models like OpenAI's GPT-5 mini, gpt-oss-120b, and Anthropic's Claude Sonnet 4.5. The largest model, Qwen3.5-122B-A10B, leads in several tests: it tops all competitors in agent-based tool use (BFCL V4, 72.2) and agent-based web search (BrowseComp, 63.8). In the HMMT math benchmark, it scores 91.4, just behind GPT-5 mini (92.0). It also takes the lead in visual reasoning (MMMU-Pro, 76.9) and document recognition (OmniDocBench, 89.8). Claude Sonnet 4.5, on the other hand, clearly outperforms all Qwen models in agent-based terminal coding (49.4) and embodied reasoning (64.7). GPT-5 mini leads in multilingual knowledge (MMMLU, 90.0) and math. Notably, the small Qwen3.5-35B-A3B, with just 3 billion active parameters, keeps up with much larger models across many tests.
Alibaba's Qwen 3.5 models match or outperform leading Western models like OpenAI's GPT-5 mini, gpt-oss-120b, and Anthropic's Claude Sonnet 4.5 across multiple benchmarks. | Image: Alibaba

All models are available on Hugging Face, ModelScope, and through Qwen Chat. They ship under the Apache License 2.0, a permissive open-source license that allows commercial use, modification, and redistribution. Qwen3.5-Flash is the hosted production version with a context length of one million tokens and built-in tools. The API costs $0.10 per million input tokens and $0.40 per million output tokens.
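
To put the quoted API prices in perspective, the cost of a single request can be estimated with simple arithmetic. The Python sketch below is only an illustration: the two prices come from the figures above, while the helper name `estimate_cost` and the example token counts are assumptions made for this example, not part of any Alibaba SDK.

```python
# Prices quoted in the article for the hosted Qwen3.5-Flash API,
# in US dollars per million tokens.
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed workload: a prompt that fills the full one-million-token
# context window, followed by a 2,000-token answer.
cost = estimate_cost(1_000_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, a request that uses the entire one-million-token context comes to roughly ten cents, with the input side accounting for almost all of it.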

