Meta's ambitious new SeamlessM4T aims to become your personal universal translator

Meta releases a new AI model capable of directly translating speech in 35 languages and text in 100 languages.

With the new multimodal encoder-decoder model "SeamlessM4T", Meta combines technologies from its long-standing AI translation projects No Language Left Behind (NLLB), Universal Speech Translator and Massively Multilingual Speech into a single model. M4T stands for Massively Multilingual & Multimodal Machine Translation.

According to Meta, by implementing the various previous models in one system, it reduces errors and delays, improving the efficiency and quality of the translation process.

The model is multimodal in that it can translate text into 100 languages in addition to spoken language, which is audio in 35 languages. In total, the model can translate speech-to-text, speech-to-speech, text-to-speech, and text-to-text, as well as automatically recognize speech.

According to Meta, SeamlessM4T is the first model that can translate many languages (35) directly back into spoken language without going through the detour of text translation. The model is supposed to be a "significant step" on the way to a universal translator like the Babel fish in The Hitchhiker's Guide to the Galaxy, which Meta explicitly mentions as a goal in the announcement.

Video: Meta

AI could help Meta to overcome language barriers on its social platforms

According to Meta, SeamlessM4T achieves new state-of-the-art results on major translation benchmarks, outperforming OpenAI's Whisper. If you want to see for yourself, you can try an interactive demo here.

The largest model, SeamlessM4T-Large, is also ahead of Whisper in human evaluation, but the gap is smaller than in the automatic benchmarks. Both models still lag human translations in terms of quality, but the gap narrows with each new model.

Meta is releasing the model under the CC BY-NC 4.0 license as an open-source model on Github, but you cannot use it commercially. According to Meta CEO Mark Zuckerberg, it will be integrated into the company's own social platforms Facebook, Instagram, WhatsApp, Messenger and Threads in the future.

Recommendation

AI research

Humans might need a permission slip to use the internet soon, thanks to AI

In addition to the model, Meta is also releasing the "SeamlessAlign" dataset that the team compiled to train SeamlessM4T. Meta says it is the largest open dataset for multimodal translation, with 470,000 hours of material for 37 languages. Expansion to 100 languages is a topic for future development. This would be the next step towards a universal translator.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Meta's ambitious new SeamlessM4T aims to become your personal universal translator

AI could help Meta to overcome language barriers on its social platforms

Humans might need a permission slip to use the internet soon, thanks to AI

Meta's AI lab turns 10 with three new AI projects and an impressive demo

Fine-tuned Meta Code Llama outperforms GPT-4 in key benchmark

Meta shows extremely fast learning algorithm for speech, image and text

OpenAI launches new ChatGPT agent that automates complex tasks for Pro, Plus, and Team

Kimi-K2 is the next open-weight AI milestone from China after Deepseek

New Energy-Based Transformer architecture aims to bring better "System 2 thinking" to AI models

Meta's ambitious new SeamlessM4T aims to become your personal universal translator

AI could help Meta to overcome language barriers on its social platforms

Share

Bank details