
Alibaba Cloud has launched Qwen 2.5, a new generation of AI models that rival leading open-source alternatives like Llama 3.1 in benchmark tests. The suite includes variants for general language tasks, programming, and mathematics.


The Qwen 2.5 series offers models ranging from 0.5 to 72 billion parameters. Alibaba claims its largest model, Qwen2.5-72B, outperforms competitors such as Llama-3.1-70B and Mistral-Large-V2 on benchmarks like MMLU. Smaller versions, including Qwen2.5-14B and Qwen2.5-32B, reportedly match the performance of larger models like Phi-3.5-MoE-Instruct and Gemma2-27B-IT.

According to Alibaba, the Qwen2.5 models were trained on a dataset of up to 18 trillion tokens and support more than 29 languages. They can process context windows of up to 128,000 tokens and generate up to 8,000 tokens of output.

Qwen2.5-Coder, optimized for programming tasks, reportedly outperforms many larger language models across a range of programming languages and coding tasks.


Qwen2.5-Math builds on the earlier Qwen2-Math, incorporating additional mathematical data, including synthetic data generated by its predecessor. Alibaba reports that Qwen2.5-Math-72B-Instruct surpasses models like GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B on math-focused benchmarks such as GSM8K, MATH, and MMLU-STEM.

Some models open-source

Most Qwen2.5 models are open-source under the Apache 2.0 license, except for the 3B and 72B variants. Alibaba also offers API access to its most powerful models through Qwen-Plus and Qwen-Turbo.

The company highlights improvements in processing structured data, generating structured output, and adapting to varied system prompts. These enhancements are intended to simplify role-play implementations and chatbot condition-setting.

Qwen 2.5 follows earlier releases like Qwen2 and Qwen2-VL, a multimodal model capable of analyzing images and videos up to 20 minutes long.

Alibaba plans to develop even larger Qwen models in the future, including more multimodal variants with image and audio capabilities. All models are available on GitHub.

Summary
  • Alibaba has introduced Qwen 2.5, a new series of AI models that are optimized for general language, programming, and mathematics. The models are available in sizes ranging from 0.5 to 72 billion parameters.
  • According to Alibaba, the Qwen2.5 models outperform leading open source models such as Llama 3.1 in benchmarks. They have been trained on up to 18 trillion tokens, support over 29 languages and can process up to 128,000 tokens.
  • Most Qwen2.5 models are available as open source under the Apache 2.0 license. Alibaba plans to train even larger models in the future, including multimodal capabilities for image and audio data.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.