Alibaba's Qwen introduces new models for voice, image editing and safety

Sep 23, 2025

Alibaba

Alibaba's Qwen AI group has rolled out several new models and updates.

The new Qwen3-TTS-Flash model generates natural-sounding speech in ten languages, including Chinese, English, Italian, and French. It offers 17 different voices, supports nine Chinese dialects, and delivers audio output in just 97 milliseconds, according to Alibaba.

For image editing, Qwen-Image-Edit-2509 handles faces, products, and text more consistently than the previous version. The model can process multiple input images at once and works with control maps such as depth or edge maps. You can try out the new version in Qwen Chat.

The updated Qwen image editing model can merge multiple source images into a single new image. | Image: Alibaba

Qwen is also introducing Qwen3Guard, a new content moderation model that comes in three sizes (0.6B, 4B, 8B) and evaluates content in 119 languages. Qwen3Guard detects problematic material either in real time (Qwen3Guard-Stream) or across the full context (Qwen3Guard-Gen), classifying it as safe, controversial, or unsafe.

Other updates include a faster version of Qwen3-Next and a new multimodal model, Qwen3-Omni.
