Content
summary Summary

How harmful and dangerous is AI? One way to find out is through "red teaming".

Red teaming is a strategy used in many fields, including AI development. Basically, the "red team" is an independent group that tries to investigate or deliberately infiltrate the system, project, process, or whatever for vulnerabilities. The goal is to make the system more secure.

AI systems can also have such vulnerabilities or exhibit unexpected or undesirable behavior. This is where red teaming comes in: a red team in AI development acts as a kind of "independent auditor". It tests the AI, trying to manipulate it or find flaws in its processes, preferably before the system is deployed in a real environment.

OpenAI says it invested more than six months in red teaming GPT-4 and using the results to improve the model. According to the test results, the unfiltered GPT-4 was able to detail cyberattacks on military systems, for example.

Ad
Ad

Model and system level: Microsoft uses two-tier Red Teaming

Microsoft uses Red Teaming to investigate large foundational models, such as GPT-4, as well as at the application level, such as Bing Chat, which accesses GPT-4 with additional functionality. These investigations influence the development of the models and the systems through which users interact with the models, Microsoft says.

The tech giant says it has expanded its Red Team for AI and is committed to responsible AI in addition to security. With generative AI, Microsoft says there are two types of risks: intentional manipulation, which is the exploitation of security vulnerabilities by users with malicious intent, but also security risks that arise from the normal use of large language models, such as the generation of false information.

Microsoft cites Bing Chat, of all things, as an example of extensive red-teaming. This seems odd, since Bing Chat intentionally went live in an unsafe version and generated abusive responses. So much so that Microsoft initially had to reduce the number of chats not long after launch.

If anything, Bing Chat didn't seem to have undergone extensive security testing, and OpenAI reportedly warned Microsoft not to launch Bing Chat prematurely. But Microsoft didn't care because ChatGPT was already on its way to the moon.

AI needs more than your standard red team

Another challenge for AI red-teaming, according to Microsoft: Traditional red-teaming is deterministic-the same input produces the same output. AI red-teaming, on the other hand, has to work with probabilities.

Recommendation

Potentially harmful scenarios must be tested multiple times, and there is a wider range of possible harmful outcomes. For example, an AI attack might fail on the first attempt, but succeed later.

To make matters worse, AI systems are constantly and rapidly evolving, according to Microsoft. Therefore, AI requires a layered approach to defense that includes classifiers, meta prompts, and limiting conversational drift (when the AI goes down the wrong path). Microsoft provides guidance on red teaming large language models on its Azure Learning Platform.

Microsoft provides guidance on red teaming large language models on its Azure Learning Platform.

Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Red teaming is a strategy in which an independent team investigates potential vulnerabilities in AI systems to make them more secure.
  • OpenAI and Microsoft use red teaming to test foundational models like GPT-4 and applications like Bing Chat for vulnerabilities and minimize potential security risks.
  • According to Microsoft, AI red teaming is particularly challenging because it works with probabilities and AI systems are constantly evolving, requiring a layered defense approach.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.