Alibaba's Qwen introduces new models for voice, image editing and safety

Alibaba's Qwen AI group has rolled out several new models and updates.

The new Qwen3-TTS-Flash model generates natural-sounding speech in ten languages, including Chinese, English, Italian, and French. It offers 17 different voices, supports nine Chinese dialects, and reaches a first-packet latency as low as 97 milliseconds, according to Alibaba.

For image editing, Qwen Image Edit 2509 handles faces, products, and text more consistently than its predecessor. The model can process multiple input images at once and works with control maps such as depth or edge maps. You can try out the new version in Qwen Chat.

The updated Qwen image editing model can merge multiple source images into a single new image. | Image: Alibaba

Qwen is also introducing Qwen3Guard, a new content moderation model that comes in three sizes (0.6B, 4B, 8B) and evaluates content in 119 languages. Qwen3Guard detects problematic material either in real time as text is generated (Qwen3Guard-Stream) or by evaluating the full context (Qwen3Guard-Gen), classifying it as safe, controversial, or unsafe.
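
A three-level verdict lets an application handle the middle category differently depending on how strict it needs to be. The following is a minimal sketch of that idea, not Qwen3Guard's actual API: the `Verdict` enum, the `decide` function, the `strict` flag, and the action names are all hypothetical, and the verdict is assumed to come from whatever moderation model is in use.

```python
from enum import Enum


class Verdict(Enum):
    """The three labels named in the article (hypothetical representation)."""
    SAFE = "safe"
    CONTROVERSIAL = "controversial"
    UNSAFE = "unsafe"


def decide(verdict: Verdict, strict: bool = False) -> str:
    """Map a moderation verdict to an illustrative application action.

    "unsafe" is always blocked, "safe" is always allowed, and the
    "controversial" tier is blocked only when the caller opts into
    strict mode; otherwise it is queued for human review.
    """
    if verdict is Verdict.UNSAFE:
        return "block"
    if verdict is Verdict.CONTROVERSIAL:
        return "block" if strict else "review"
    return "allow"


if __name__ == "__main__":
    for v in Verdict:
        print(v.value, "->", decide(v), "| strict:", decide(v, strict=True))
```

In this sketch, only the treatment of "controversial" content changes between the lenient and strict settings, which is the practical benefit of having a middle label rather than a binary safe/unsafe decision.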

Other updates include a faster version of Qwen3-Next and a new multimodal model, Qwen3-Omni.

