ElevenLabs shares early results from its AI support agent

ElevenLabs recently shared how well its AI support agent is performing. While the system handles most documentation-related questions successfully, it starts to struggle when dealing with more complex issues.

According to the company, its AI agent, which is built into its documentation system, handles about 200 support calls daily. Internal evaluations, backed by a manual review of 150 calls, show that the system successfully answers about 80 percent of user questions.

Behind the scenes, the system relies on several key components. A detailed prompt establishes the AI agent as "Alexis," a technical support representative, and lays out specific guidelines. These include instructions for formatting text for speech output, handling different languages, and using tools to forward requests when needed.

The system draws from a knowledge base that includes a condensed version of ElevenLabs' documentation (still about 80,000 characters long) and relevant URLs, along with specific FAQ entries and clarifications.
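As a rough illustration of how a prompt and a knowledge base of this shape might be combined into one context, here is a minimal Python sketch. It is not ElevenLabs' implementation: the names KnowledgeBase, build_context, and max_chars, the section labels, and the size limit are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container mirroring the pieces the article lists."""
    condensed_docs: str                      # condensed documentation text
    urls: list[str] = field(default_factory=list)
    faq_entries: dict[str, str] = field(default_factory=dict)

    def render(self) -> str:
        """Flatten all parts into one plain-text block."""
        sections = ["DOCUMENTATION", self.condensed_docs]
        if self.urls:
            sections.append("RELEVANT URLS")
            sections.extend(self.urls)
        if self.faq_entries:
            sections.append("FAQ AND CLARIFICATIONS")
            for question, answer in self.faq_entries.items():
                sections.append(f"Q: {question}\nA: {answer}")
        return "\n\n".join(sections)


def build_context(system_prompt: str, kb: KnowledgeBase,
                  max_chars: int = 100_000) -> str:
    """Join the prompt and knowledge base; fail loudly if it grows too large.

    max_chars is an illustrative budget, not a documented limit.
    """
    context = f"{system_prompt}\n\n{kb.render()}"
    if len(context) > max_chars:
        raise ValueError(
            f"context is {len(context)} characters, limit is {max_chars}"
        )
    return context
```

A length check like this is one simple way to notice when a condensed documentation file drifts past whatever budget the agent can handle.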

To keep track of performance, the system uses AI to monitor various metrics, including accuracy (checking for departures from the knowledge base), interaction quality, and success rates. It also tracks problem types and product categories, and keeps records of unresolved issues and user responses.
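The metrics described above could be aggregated from per-call evaluation records along the following lines. This is a sketch under assumptions: the CallEvaluation fields and the summarize function are hypothetical names chosen for illustration, and the article does not describe how ElevenLabs actually stores or computes these values.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class CallEvaluation:
    """One evaluated call (hypothetical schema)."""
    call_id: str
    deviated_from_knowledge_base: bool  # accuracy check failed
    resolved: bool                      # user question answered successfully
    problem_type: str
    product_category: str


def summarize(evaluations: list[CallEvaluation]) -> dict:
    """Aggregate per-call evaluations into summary metrics."""
    total = len(evaluations)
    if total == 0:
        # Avoid division by zero when there is nothing to evaluate.
        return {"total": 0, "success_rate": None, "deviation_rate": None,
                "by_problem_type": Counter(), "by_product_category": Counter(),
                "unresolved_call_ids": []}
    resolved = sum(e.resolved for e in evaluations)
    deviated = sum(e.deviated_from_knowledge_base for e in evaluations)
    return {
        "total": total,
        "success_rate": resolved / total,
        "deviation_rate": deviated / total,
        "by_problem_type": Counter(e.problem_type for e in evaluations),
        "by_product_category": Counter(e.product_category for e in evaluations),
        "unresolved_call_ids": [e.call_id for e in evaluations if not e.resolved],
    }
```

Keeping the list of unresolved call IDs alongside the rates makes it easy to pull those calls for the kind of manual review the company describes.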

Real-world performance and limitations

The usage data reveals some interesting patterns. Many users simply test the system's capabilities, trying out different languages or non-technical conversations. Even with protective measures in place, the system sometimes strays from support-related topics.

The agent is strongest at specific documentation questions, such as those about API endpoints or integration options. However, it has some clear limitations: it tends to answer vague questions instead of asking for clarification, struggles to explain code examples through voice, and often responds to complex issues with overwhelming lists of recommendations.

The system also hits walls when dealing with account-specific issues, pricing questions, and debugging problems. It can't handle recurring verification errors, either.

ElevenLabs concludes that its AI agent works best with a specific audience looking for straightforward documentation help. More complicated issues like troubleshooting or pricing questions still need a human touch.

Anyone interested can try out the support agent through the documentation.
