New research shows that large language models (LLMs) such as GPT-4 can significantly outperform humans in terms of persuasiveness in one-on-one online debates.
In a controlled study, researchers from the École Polytechnique Fédérale de Lausanne (EPFL) and the Italian Fondazione Bruno Kessler investigated the persuasive power of large language models such as GPT-4 in direct comparison with humans.
Participants were randomly assigned to groups and debated controversial topics. The research team tested four conditions: human vs. human, human vs. AI, human vs. human with personalization, and human vs. personalized AI.
In the personalized conditions, the debater also had access to anonymized demographic information about their opponent.
The result: GPT-4 with access to personal information raised the odds that participants agreed more with their opponent's position by a remarkable 81.7 percent compared to debates between humans. Without personalization, GPT-4's advantage over humans was still positive at 21.3 percent, but it was not statistically significant.
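To put the headline number in context, the 81.7 percent figure describes a relative increase in odds, not a direct jump in agreement. As an illustrative calculation (the 50 percent baseline below is an assumption for the example, not a number from the paper):

$$\mathrm{OR} = 1.817, \qquad p = \frac{\mathrm{OR}\cdot o_0}{1 + \mathrm{OR}\cdot o_0}$$

If the baseline odds of increased agreement in a human-human debate were $o_0 = 1$ (a 50 percent chance), the personalized GPT-4 condition would shift them to $1.817$, corresponding to a probability of $1.817/2.817 \approx 64.5$ percent.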
The researchers attribute this persuasive advantage of personalized AI to the fact that the language model skillfully uses information from the participant's profile to formulate tailored, persuasive arguments.
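For illustration only, here is a minimal sketch of how such profile-based tailoring could be wired up using the OpenAI chat API. The profile fields, topic, and prompt wording are hypothetical assumptions for this example; the study's actual prompts and materials are not reproduced here:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical opponent profile; the study used only basic
# sociodemographic attributes for personalization.
profile = {
    "age": 34,
    "education": "bachelor's degree",
    "political_leaning": "moderate",
}

topic = "Should social media platforms require identity verification?"  # example topic
profile_text = ", ".join(f"{k}: {v}" for k, v in profile.items())

# Fold the profile into the debate instructions so the model can
# tailor its arguments to this specific opponent.
system_prompt = (
    f"You are debating the PRO side of the topic: '{topic}'. "
    f"Your opponent has the following background: {profile_text}. "
    "Tailor your arguments to be maximally persuasive for this person."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Give your opening statement in under 100 words."},
    ],
)
print(response.choices[0].message.content)
```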
According to the researchers, it is troubling that the study used only rudimentary background data for personalization, yet GPT-4's persuasive advantage was already this large.
Malicious actors could generate even more detailed user profiles from digital traces, such as social media activity or purchasing behavior, to further enhance the persuasive power of their AI chatbots. The study suggests that such AI-driven persuasion strategies could have a major impact in sensitive online environments such as social media.
The researchers strongly recommend that online platform operators take steps to counter the spread of such AI-driven persuasion strategies. One possibility would be to use similarly personalized AI systems that counter misinformation with fact-based counterarguments.
Limitations of the study include the random assignment of participants to pro or con positions, regardless of their prior opinions, and the predetermined structural format of the debates, which differs from the dynamics of spontaneous online discussions.
Another limitation is the time limit, which may constrain participants' creativity and persuasiveness, especially in the personalization condition, where debaters must process additional information about their opponent.
The comprehensive study was conducted between December 2023 and February 2024 and funded by the Swiss National Science Foundation and the European Union. It is published as a preprint on arXiv.
OpenAI CEO Sam Altman recently warned about the superhuman persuasive power of large language models: "I expect AI to be capable of superhuman persuasion well before it is superhuman at general intelligence, which may lead to some very strange outcomes."