TransAgents uses AI teamwork to tackle the complexities of literary translation

May 26, 2024

Wu et al.

Key Points

Researchers from Monash University, the University of Macau and Tencent AI Lab have developed TransAgents, a literary translation system that simulates a translation agency with different AI agents in different roles.
The agents are given detailed profiles and work together in a multi-stage process to create, review and improve translations.
Although TransAgents performs worse on traditional metrics, human reviewers and an LLM reviewer prefer its translations over human-written references and GPT-4 translations. However, there are limitations, such as missing relevant content.

Instead of using a single instance of an LLM, a research team has multiple LLM-based agents work together to translate a text. This approach improves quality, but it has some problems.

Researchers from Monash University, the University of Macau, and Tencent AI Lab have developed "TransAgents" to translate long works of literature. The team says literary translation is very difficult for machines because of complex language, metaphors, cultural subtleties, and unique styles.

The TransAgents system simulates a translation agency with AI agents in different roles. It has two main stages: recruitment and translation, each with sub-steps. First, a CEO agent selects a lead editor based on the client's needs. The editor then builds a team of junior editors, translators, localization experts, and proofreaders.

The researchers gave each AI employee a detailed profile generated with GPT-4. These profiles go beyond language skills to make the simulation more realistic and to reflect the diversity of real agencies.

Example of an AI agent profile in TransAgents. | Image: Wu et al.

The translation process uses strategies such as "addition-by-subtraction," where one agent extracts key information and another cuts out redundant parts and provides feedback. In "trilateral collaboration," agents are assigned to create, review, and approve translations. To keep things consistent, the researchers created a style guide with a glossary, book summary, tone, style, and target audience.
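The create, review, and approve loop can be pictured with a short sketch. This is not the researchers' code: the function names, the round limit, and the toy stand-in agents are all assumptions made for illustration, and in a real system each role would wrap a call to a language model.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical role signatures (assumptions, not taken from the paper).
Creator = Callable[[str, List[str]], str]   # (source, feedback so far) -> draft
Reviewer = Callable[[str, str], str]        # (source, draft) -> critique, "" if none
Approver = Callable[[str, str, str], bool]  # (source, draft, critique) -> accept?


@dataclass
class Result:
    draft: str
    rounds: int
    approved: bool


def trilateral_collaboration(source: str, create: Creator, review: Reviewer,
                             approve: Approver, max_rounds: int = 3) -> Result:
    """Redraft until the approver accepts or the round limit is reached."""
    feedback: List[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = create(source, feedback)       # one agent writes a draft
        critique = review(source, draft)       # a second agent critiques it
        if approve(source, draft, critique):   # a third agent signs off
            return Result(draft, round_no, True)
        feedback.append(critique)              # otherwise feed the critique back
    return Result(draft, max_rounds, False)


# Toy stand-ins so the sketch runs without any model.
def toy_create(source: str, feedback: List[str]) -> str:
    # Switch to upper case once the reviewer has asked for it.
    return source.upper() if feedback else source


def toy_review(source: str, draft: str) -> str:
    return "" if draft.isupper() else "please use upper case"


def toy_approve(source: str, draft: str, critique: str) -> bool:
    return critique == ""


result = trilateral_collaboration("hello world", toy_create, toy_review, toy_approve)
print(result)
```

Keeping the three roles as separate callables means no single agent both writes and signs off on its own draft, which is the point of splitting the work three ways.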

To test the literary translations, the researchers propose two approaches:

1. Monolingual Human Preference (MHP) - Target audience reviewers rate the flow, readability, and cultural fit of the translation without seeing the original.

2. Bilingual LLM Preference (BLP) - An LLM compares the translation directly to the source text.
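Both approaches boil down to counting which of two translations a reviewer prefers. A minimal sketch of that tally follows. The labels "a", "b", and "tie" and the sample judgments are made up for illustration and are not data from the study.

```python
from collections import Counter
from typing import Dict, Iterable


def preference_rates(judgments: Iterable[str]) -> Dict[str, float]:
    """Return the share of votes for translation 'a', translation 'b', and ties."""
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments to tally")
    return {label: counts.get(label, 0) / total for label in ("a", "b", "tie")}


# Invented example: four reviewers compare translation a with translation b.
rates = preference_rates(["a", "a", "b", "tie"])
print(rates)
```

The same tally works whether the votes come from human readers who see only the translations or from an LLM that also sees the source text.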

The results show that TransAgents scored lowest on d-BLEU, a document-level metric that measures how much of a machine translation's wording overlaps with a reference translation. But both human and AI reviewers preferred its translations to human-written references and GPT-4 versions. TransAgents beat human translations in genres requiring specialized knowledge, such as history and culture. But it fell short in modern genres.
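Why can a translation that readers prefer score poorly on an overlap metric? The sketch below computes clipped n-gram precision, one ingredient of BLEU-style scores. It is a simplification under stated assumptions: it leaves out the brevity penalty and the averaging over n-gram sizes, and the example sentences are invented rather than taken from the study.

```python
from collections import Counter
from typing import Tuple


def ngrams(tokens: list, n: int) -> Counter:
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: str, reference: str, n: int) -> float:
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the reference.
    """
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


# Invented sentences: one close rendering and one freer rendering.
reference = "the old man walked slowly to the river"
literal = "the old man walked slowly to the river bank"
free = "with weary steps the elder made his way down to the water"

for name, text in (("literal", literal), ("free", free)):
    print(name, round(clipped_precision(text, reference, 1), 2))
```

The freer rendering shares few words with the reference and so scores lower, even though a reader might find it the more vivid of the two.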

The researchers also found that TransAgents used more varied and vivid descriptions than other systems, and a cost analysis shows that using TransAgents could be 80 times cheaper than human translators for literature.

While the researchers highlight these results, they also note major limitations of AI translation systems, specifically that they can skip over important content.

Text in red and blue was left out by TransAgents. | Image: Wu et al.

Machine translation has come a long way, but its potential has exploded with generative AI. Previous research has shown that systems with multiple AI agents can improve performance.

A recent example, "More Agents Is All You Need," comes from Tencent AI Lab, which also helped develop TransAgents. Combining the two research areas was a natural step.

But until the skipping problem is solved, these systems can only assist humans in literary translation, not replace them, because close supervision is still required. The same was true in a recent Stanford study that examined generative AI tools for assisting with legal tasks.

Source: Arxiv