Why Alexa still can't carry on fluent dialogues despite significant AI progress

Setting the alarm clock, asking about the weather - Alexa understands simple commands without problems. But beyond that, things get tricky. Why is that?

Compared to large language models (LLMs) like GPT-3, voice assistants like Alexa and Google Assistant are pretty tight-lipped. Real conversations do not take place; the systems reliably understand only simple commands and turn them into actions.

The latest research already enables chatbots with more eloquence, so why not advanced voice assistants? AI researcher Gary Marcus, author of The Road to AI We Can Trust newsletter, explores the question in this month's issue.

Marcus rules out the fairly obvious explanations from the outset. Has no one at Amazon been following the latest research? That is unlikely - after all, LLMs have long been used for the company's powerful product recommendation engine.

An unwillingness to pay licensing costs is also an unlikely explanation, as the company could easily provide the infrastructure itself through Amazon Web Services. Moreover, Amazon has sufficient experience in scaling such systems.

Alexa powered by a large language model could lead to Amazon losing control

Marcus formulates five reasons that contribute to Alexa not being able (or rather: allowed) to hold conversations, even though it would be technically possible. They ultimately boil down to one central point: LLMs are not yet reliable enough for broad and automated commercial use.

According to Marcus:
LLMs are unreliable,
LLMs are unruly,
Amazon does not want to make itself vulnerable,
Amazon does not want customers to develop unrealistic expectations,
and LLMs are made for words, not actions.

Amazon would rather sell its customers a product that reliably performs a limited range of tasks. Language models, on the other hand, are unpredictable and difficult to control, Marcus writes.

Moreover, while GPT-3 can generate a string of connected words, it can't yet reliably link them to actions. The startup Adept and Google's SayCan project are working on this.

Amazon lays off employees for AI and conversation

At the moment, it doesn't look like Alexa will make great strides in the foreseeable future. A few days ago, Amazon announced that it was laying off thousands of employees amid the crisis in big tech stocks.

Employees working on AI systems, natural language processing (NLP) and conversational skills were particularly affected. This could be an indication that Amazon is scaling back its Alexa efforts, or at least not pushing them at the moment. According to a media report, Alexa hardware development alone is said to have caused Amazon a loss of ten billion US dollars this year. No other area at Amazon causes such high losses.

Perhaps Google, which has been researching NLP extensively in recent years and is betting on Google Assistant as its next interface, will do better. By early 2023, Google Assistant should be able to cope with natural pauses in speech and other stumbling blocks in understanding human voice commands.

In addition, Google is currently rolling out LaMDA, its advanced conversational AI, in a test environment. LaMDA could be the basis for a next-gen Assistant and a new form of Internet search - provided Google gets a handle on what Marcus calls unruly LLMs.

The fact that Google is only rolling out LaMDA step by step and has been intensively testing it internally for months is directly related to the points of criticism raised by Marcus: it is about safety and reliability.

The concerns include, for example, bias and racism, as well as effects that are difficult to predict, such as the apparent sentience that ex-Google employee Blake Lemoine fell for. Google's sister company DeepMind recently unveiled a dialog AI optimized for safety.
