NotebookLM’s "Audio Overviews" feature is now available in approximately 75 languages, including less widely spoken ones such as Icelandic, Basque, and Latin. The audio for each language is generated by AI agents using "metaprompting," with Gemini 2.5 Pro as the underlying language model. At the same time, Google is moving to an audio production technology based entirely on Gemini’s multimodality, a development that does not bode well for providers focused exclusively on audio models.
As with AI-generated text, audio created by language models can contain inaccuracies. The problem is especially pronounced in AI-generated podcasts: a large amount of audio may be produced from very little source text, and converting text into dialogue substantially alters the original material.