Alibaba's Qwen team has released two new small-scale multimodal models, Qwen3-VL-30B-A3B-Instruct and Qwen3-VL-30B-A3B-Thinking, mixture-of-experts models with around 30 billion total parameters, of which roughly 3 billion are active per token. According to Qwen, both versions are competitive with GPT-5 Mini and Claude 4 Sonnet, and on some benchmarks outperform them in math, image recognition, text recognition, video understanding, and agent control.
The lineup also includes FP8 versions for faster inference, as well as an FP8 variant of the larger Qwen3-VL-235B-A22B model. The models are available on Hugging Face, ModelScope, and GitHub, or via an Alibaba Cloud API; there is also a web chat interface for direct use.
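For readers who want to try the open weights locally, here is a minimal sketch of loading the Instruct variant through the Hugging Face transformers library. The repo id "Qwen/Qwen3-VL-30B-A3B-Instruct", the example image URL, and support in the image-text-to-text pipeline are assumptions; check the model card for the exact usage Qwen recommends.

```python
# Minimal sketch: running Qwen3-VL-30B-A3B-Instruct via the transformers
# image-text-to-text pipeline. Model id and image URL are placeholders /
# assumptions; consult the Hugging Face model card for the official recipe.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="Qwen/Qwen3-VL-30B-A3B-Instruct",  # assumed repo id
    torch_dtype="auto",   # pick up the published weight dtype
    device_map="auto",    # spread layers across available GPUs
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},  # placeholder image
            {"type": "text", "text": "Describe what this image shows."},
        ],
    }
]

print(pipe(text=messages, max_new_tokens=256))
```

The same pattern should apply to the Thinking variant by swapping the model id, though the Thinking model typically emits intermediate reasoning before its final answer.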