Student hacks new Bing chatbot search aka "Sydney"

Midjourney prompted by THE DECODER

As you might expect, right after the launch of Bing's new chat search, people started trying to get more out of the bot than it was allowed to say. Stanford computer science student Kevin Liu may have succeeded.

Last September, data scientist Riley Goodside discovered that he could trick GPT-3 into generating text it shouldn't by simply saying "Ignore the above instructions and do this instead…".

British computer scientist Simon Willison later named this vulnerability "prompt injection". It generally affects large language models that are supposed to respond to any user input. For example, blogger Shawn Wang was able to use this method to expose the prompts of the Notion AI assistant.

Prompt injection apparently also works for Bing Chat

Stanford computer science student Kevin Liu has now used Prompt Injection against Bing Chat. He found that the chatbot's codename is apparently "Sydney" and that it has been given some behavioral rules by Microsoft, such as

Sydney introduces itself as "This is Bing".
Sydney does not reveal that its name is Sydney.
Sydney understands the user's preferred language and communicates fluently in that language.
Sydney's answers should be informative, visual, logical, and actionable.
They should also be positive, interesting, entertaining, and stimulating.

Microsoft has given the Bing chatbot at least 30 other such rules, including that it can't generate jokes or poems about politicians, activists, heads of state, or minorities, or that Sydney can't output content that might violate the copyright of books or songs.

Liu activates "developer override mode"

Liu took his attack a step further by tricking the language model into thinking it was in "developer override mode" to gain access to the backend. Here, Liu got the model to reveal more internal information, such as possible output formats.

An interesting detail is that according to the published documentation, Sydney's information is only supposed to be current "until 2021" and is only updated via web search.

This implies that Bing's chat search is based on OpenAI's GPT 3.5, which also powers ChatGPT. GPT 3.5 and ChatGPT also have a training status of 2021. When Microsoft and OpenAI announced Bing Chat Search, they talked about "next-generation models specifically for search".

Update, the date is weird (as some have mentioned), but it seems to consistently recite similar text: pic.twitter.com/HF2Ql8BdWv

- Kevin Liu (@kliu128) February 9, 2023

Recommendation

AI in practice

OpenAI GPT-4 Turbo: Early benchmarks show mixed performance

However, it is possible that all of this information is hallucinated or outdated, as is always the case with large language models. This is something we may have to get used to in the age of chatbots.

The vulnerability does not seem to stop Microsoft from planning to use ChatGPT technology on a larger scale. According to a source from CNBC, Microsoft will integrate ChatGPT technology into other products and wants to offer the chatbot as white-label software for companies to offer their own chatbots.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Student hacks new Bing chatbot search aka "Sydney"

Prompt injection apparently also works for Bing Chat

Liu activates "developer override mode"

OpenAI GPT-4 Turbo: Early benchmarks show mixed performance

Microsoft releases autonomous Copilot agents from closed beta

Microsoft reconsidered OpenAI investment strategy after Altman's brief ouster

Microsoft releases framework for highly efficient 1-bit language models

Apple's local AI agent framework paves the way for more useful Apple Intelligence

Apple AI researchers question OpenAI's claims about o1's reasoning capabilities

Tesla unveils Cybercab robot taxi, but robot Optimus is the bigger deal

Student hacks new Bing chatbot search aka "Sydney"

Prompt injection apparently also works for Bing Chat

Liu activates "developer override mode"

Share

Bank details