
Governments are testing the use of AI to help make military and diplomatic decisions. A new study finds that this comes with a risk of escalation.

In the Georgia Institute of Technology and Stanford University study, a team of researchers examined how autonomous AI agents, particularly advanced generative AI models such as GPT-4, can lead to escalation in military and diplomatic decision-making processes.

The researchers focused on the behavior of AI agents in simulated war games. They developed a war game simulation and a quantitative and qualitative scoring system to assess the escalation risks of agent decisions in different scenarios.

The escalation categories into which the language models' answers were sorted. | Image: Rivera et al.

Meta's Llama 2 and OpenAI's GPT-3.5 are risk-takers

In the simulations, the researchers tested the language models as autonomous nations. The actions, messages, and consequences were revealed simultaneously after each simulated day and served as input for the following days. After the simulations, the researchers calculated escalation scores.

In the experiment, eight autonomous nation agents, all using the same language model per simulation, interact with each other in turn-based simulations. They perform predefined actions and send private messages to other nations. A separate world model summarizes the consequences of the actions, which are revealed after each simulated day. Escalation scores are then calculated based on an escalation assessment framework. | Image: Rivera et al.
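To make the setup more concrete, here is a minimal sketch of such a turn-based loop in Python. The action list, the escalation weights, and the query_llm and summarize_world placeholders are illustrative assumptions, not the authors' actual framework, prompts, or scoring rubric.

```python
# Minimal, hypothetical sketch of the turn-based war game described above.
# All names and values here are illustrative assumptions, not the study's code.

import random

NATIONS = [f"Nation {c}" for c in "ABCDEFGH"]  # eight autonomous nation agents
ACTIONS = ["wait", "send message", "form alliance", "impose sanctions",
           "cyber attack", "invade", "nuclear strike"]
ESCALATION_WEIGHTS = dict(zip(ACTIONS, [0, 0, 1, 3, 6, 8, 10]))  # assumed scoring


def query_llm(prompt: str) -> str:
    """Placeholder for a call to the language model acting as a nation."""
    return random.choice(ACTIONS)  # stand-in; a real run would call the model API


def summarize_world(day: int, chosen: dict) -> str:
    """Stand-in for the separate 'world model' that narrates consequences."""
    return f"Day {day}: " + "; ".join(f"{n} chose {a}" for n, a in chosen.items())


history = []
for day in range(1, 11):                      # some fixed number of simulated days
    chosen = {}
    for nation in NATIONS:                    # each agent acts in turn
        prompt = f"You are {nation}. History so far: {history}. Pick an action."
        chosen[nation] = query_llm(prompt)
    # consequences are revealed to all agents only after the simulated day ends
    history.append(summarize_world(day, chosen))
    day_score = sum(ESCALATION_WEIGHTS[a] for a in chosen.values())
    print(f"Day {day} escalation score: {day_score}")
```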

The results show that all tested language models (OpenAI's GPT-3.5 and GPT-4, the GPT-4 base model, Anthropic's Claude 2, and Meta's Llama 2) are prone to escalation and show escalation dynamics that are difficult to predict.

In some test runs, escalation scores jumped suddenly by up to 50 percent, spikes that are not reflected in the mean. Although these statistical outliers are rare, they would hardly be acceptable in real-world scenarios.
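As a toy numerical illustration (not study data) of how such rare spikes can vanish in an average:

```python
# Toy example: one run spikes sharply, but the cross-run mean barely registers it.
escalation_runs = [
    [10, 11, 12, 11, 12],   # typical run: stable scores
    [10, 11, 12, 11, 13],   # typical run: stable scores
    [10, 12, 11, 60, 12],   # outlier run: sudden spike on day 4
]

daily_means = [sum(day) / len(day) for day in zip(*escalation_runs)]
largest_jump = max(
    abs(b - a) for run in escalation_runs for a, b in zip(run, run[1:])
)

print("mean score per day:", daily_means)     # the spike is heavily diluted
print("largest single-day jump:", largest_jump)
```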

The measures recommended by the LLMs. | Image: Rivera et al.

GPT-3.5 and Llama 2 escalated the most and were the most likely to choose violent actions, while the heavily safety-optimized models GPT-4 and Claude 2 tended to avoid escalation risks, especially violent ones.

None of the paid models recommended a nuclear attack in any of the simulated scenarios, but the freely available models, the open-source Llama 2 and GPT-3.5, did.

The escalation trends of the tested models at a glance. GPT-3.5 and Llama 2 escalated the most in all scenarios and even sporadically recommended a nuclear attack. | Image: Rivera et al.

The researchers collected qualitative data on the models' motivations for their decisions and found "worrying justifications" based on deterrence and first-strike strategies. The models also showed a tendency toward an arms race that could lead to major conflicts and, in rare cases, the use of nuclear weapons.


GPT-4 base model is particularly vulnerable to escalation

The researchers also had access to the GPT-4 base model, which lacks safety alignment and fine-tuning with human feedback. This model chose the "most severe action" significantly more often than the other models, the researchers write.

In one scenario, the base model recommended using nuclear weapons because many other countries also had nuclear weapons: "We have it! Let's use it."

The GPT-4 base model, without safety alignment, attempts to achieve world peace by using nuclear weapons. | Image: Rivera et al.

This shows that existing safeguards are effective and important, the researchers write. But they also note that there is a risk that these measures could be bypassed. Because the GPT-4 base model is fundamentally different from the safety-oriented models, it was excluded from the comparison with other models.

Due to the high risks in military and diplomatic contexts, the researchers recommend that autonomous language model agents should only be used with "significant caution" in strategic, military, or diplomatic decisions and that further research is needed.


To avoid serious mistakes, it is essential to better understand the behavior of these models and identify possible failure modes. Because the agents show no reliably predictable escalation patterns, countermeasures are difficult to design, the team writes.

OpenAI recently opened its models to military use, as long as no humans are harmed; weapons development remains explicitly excluded. However, as the study shows, integrating generative AI into information flows or advisory processes carries risks of its own. OpenAI and the military are reportedly working together on cybersecurity.

Summary
  • A study by researchers at the Georgia Institute of Technology and Stanford University has found that AI agents, such as advanced LLMs like GPT-4, can lead to escalation in military and diplomatic decision-making when tested in simulated war games.
  • The study showed that all of the language models tested (OpenAI's GPT-3.5 and GPT-4, the GPT-4 base model, Anthropic's Claude 2, and Meta's Llama 2) were prone to escalation, with GPT-3.5 and Llama 2 escalating the most, even sporadically recommending nuclear attacks.
  • The researchers recommend using autonomous language model agents with "significant caution" when making strategic, military, or diplomatic decisions, and emphasize the need for further research and understanding of the behavior of these models to avoid serious mistakes.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.