
The performance of language models can be significantly improved by simply increasing the number of agents, according to a new paper.

The Tencent research team's paper, jokingly titled “More Agents Is All You Need,” examines the impact of adding more agents to a task. The title is an homage to the original Transformer paper, “Attention Is All You Need.”

The researchers introduce a “sampling-and-voting” method in which the input task is fed multiple times into a language model, or into a cooperation framework with multiple language model agents, to produce a set of results. These results are then subjected to majority voting to determine the most reliable one. This approach, which does not rely on more complex techniques such as chain-of-thought prompting, appears to be an effective tool that could improve existing methods, according to the results.
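The basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `generate` is a hypothetical stand-in for any function that sends a task to a language model and returns its answer as a string, and answers are compared by exact match after trimming whitespace, which is an assumption that only suits tasks with short, closed-form answers.

```python
from collections import Counter
from typing import Callable, List


def sample_and_vote(task: str, generate: Callable[[str], str], num_agents: int) -> str:
    """Ask the model the same task num_agents times and return the most common answer.

    `generate` is a hypothetical callable (task -> answer string); plug in
    whatever model client you use.
    """
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")

    # Sampling phase: each call plays the role of one "agent".
    answers: List[str] = [generate(task).strip() for _ in range(num_agents)]

    # Voting phase: plain majority vote by exact string match.
    # Ties are resolved in favor of the answer that appeared first.
    winner, _votes = Counter(answers).most_common(1)[0]
    return winner
```

With a sampling temperature above zero, repeated calls return different answers, and the vote filters out the ones the model produces only occasionally.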

More agents bring Llama2-13B to the level of Llama2-70B

Their experiments with different datasets and tasks show that the performance of language models increases with the size of the ensemble, i.e. with the number of agents. The team also shows that even smaller LLMs can match or outperform their larger counterparts simply by scaling the number of agents, without additional elaborate prompt designs or complex collaboration frameworks. For example, on the GSM8K dataset, Llama2-13B with sampling-and-voting achieved 59% accuracy, outperforming Llama2-70B, which achieved 54% accuracy.

However, the study also shows the limitations of this method. Performance gains initially increase as task difficulty increases, but then decrease again. This suggests that there is a complexity threshold beyond which simply adding more agents does not lead to further improvements. Furthermore, performance increases with the prior probability of the correct answer, i.e., a model that lacks certain capabilities will not acquire them simply by scaling the number of agents. Under the right conditions, however, the gains grow with the number of reasoning steps, although so, of course, does the cost.
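A toy calculation shows why the prior probability matters so much. The sketch below is not taken from the paper: it assumes that every agent answers independently, that each is correct with the same probability `p`, and that there is only one possible wrong answer, so the vote is a simple binomial majority.

```python
from math import comb


def majority_accuracy(p: float, num_agents: int) -> float:
    """Probability that a strict majority of independent agents is correct.

    Toy model (an assumption, not the paper's analysis): each agent is right
    with probability p and there is a single wrong alternative.
    """
    if num_agents < 1 or num_agents % 2 == 0:
        raise ValueError("use an odd, positive number of agents to avoid ties")
    needed = num_agents // 2 + 1  # smallest number of votes that forms a majority
    return sum(
        comb(num_agents, k) * p**k * (1 - p) ** (num_agents - k)
        for k in range(needed, num_agents + 1)
    )


if __name__ == "__main__":
    for p in (0.6, 0.4):
        for n in (1, 3, 15):
            print(f"p={p}, agents={n}: {majority_accuracy(p, n):.3f}")
```

In this simplified model, voting amplifies whatever tendency the single model already has: when `p` is above 0.5, accuracy rises with more agents, and when it is below 0.5, accuracy falls. Real tasks usually have many different wrong answers, so a correct answer can win a plurality even at lower `p`, but the dependence on the prior remains.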

“Sampling and voting” can be combined with other methods

“More Agents” is not a silver bullet, but the results show that it helps. It also works independently of existing optimization methods, such as chain-of-thought prompting, and can therefore be combined with them for further improvements.

Based on these findings, the researchers propose optimization strategies for getting even more out of additional agents. These include stepwise sampling and voting for tasks that require multiple reasoning steps, and a hierarchical approach for tasks with low prior probabilities, such as using different models for subtasks of differing difficulty.
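Stepwise sampling and voting could look roughly like the sketch below. The article only names the strategy, so the details here are assumptions: `generate_step` is a hypothetical callable that takes the task plus the steps chosen so far and returns one candidate for the next step, the number of steps is fixed in advance, and candidates are compared by exact match.

```python
from collections import Counter
from typing import Callable, List


def stepwise_sample_and_vote(
    task: str,
    generate_step: Callable[[str, List[str]], str],
    num_steps: int,
    num_agents: int,
) -> List[str]:
    """Vote on every reasoning step instead of only on the final answer.

    `generate_step` is a hypothetical callable (task, previous steps) -> next step.
    """
    chosen: List[str] = []
    for _ in range(num_steps):
        # Sample several candidates for the next step, all conditioned on
        # the steps that won the earlier votes. A copy is passed so the
        # callable cannot mutate the running list.
        candidates = [generate_step(task, list(chosen)).strip() for _ in range(num_agents)]
        # Keep the most frequent candidate and build on it.
        winner, _votes = Counter(candidates).most_common(1)[0]
        chosen.append(winner)
    return chosen
```

The design choice is that an error in an early step is voted out before later steps build on it, at the price of `num_steps * num_agents` model calls.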

Summary
  • Tencent researchers show in a study that language model performance can be improved by adding multiple agents without the need for complex prompt designs or collaboration frameworks.
  • The sampling-and-voting method uses multiple language model agents to generate a set of results, which are subject to majority voting to determine the most reliable result.
  • However, the study shows limitations of this method, such as a complexity threshold beyond which adding more agents no longer brings improvements, and performance only increases when the right conditions are met.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.