Alibaba Cloud has launched Qwen 2.5, a new generation of AI models that rival leading open-source alternatives like Llama 3.1 in benchmark tests. The suite includes variants for general language tasks, programming, and mathematics.
The Qwen 2.5 series offers models ranging from 0.5 to 72 billion parameters. Alibaba claims its largest model, Qwen2.5-72B, outperforms competitors such as Llama-3.1-70B and Mistral-Large-V2 on benchmarks like MMLU. Smaller versions, including Qwen2.5-14B and Qwen2.5-32B, reportedly match the performance of larger models like Phi-3.5-MoE-Instruct and Gemma2-27B-IT.
According to Alibaba, the Qwen2.5 models were trained on a dataset of up to 18 trillion tokens and support over 29 languages. They can process up to 128,000 tokens and generate 8,000 tokens.
Qwen2.5-Coder, optimized for programming tasks, reportedly outperforms many larger language models across various programming languages and tasks, despite its smaller size.
Qwen2.5-Math builds on the earlier Qwen2-Math, incorporating additional mathematical data, including synthetic data generated by its predecessor. Alibaba reports that Qwen2.5-Math-72B-Instruct surpasses models like GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B on math-focused benchmarks such as GSM8K, Math, and MMLU-STEM.
Some models open-source
Most Qwen2.5 models are open-source under the Apache 2.0 license, except for the 3B and 72B variants. Alibaba also offers API access to its most powerful models through Qwen-Plus and Qwen-Turbo.
The company highlights improvements in processing structured data, generating structured output, and adapting to various system prompts. These enhancements aim to simplify the implementation of role-playing games and chatbot configuration.
Qwen 2.5 follows earlier releases like Qwen2 and Qwen2-VL, a multimodal model capable of analyzing images and videos up to 20 minutes long.
Alibaba plans to develop even larger Qwen models in the future, including more multimodal variants with image and audio capabilities. All models are available on GitHub.