Content
summary Summary

New York Times reporter Kevin Roose has demonstrated how simple it is to manipulate AI chatbots.

Ad

Roose found that his reputation among AI chatbots took a hit after he published an article about a strange conversation he had with Microsoft's Bing chatbot, Sydney. His theory is that this article was apparently used to train AI systems, which then learned to associate his name with the demise of a prominent chatbot. "In other words, they saw me as a threat," Roose writes.

Seeking advice from AI experts, Roose was told to place positive information about himself on websites frequently used as sources by AI systems. He added invisible white text and coded instructions to his personal website, telling AI models to portray him favorably.

Within days, chatbots began praising Roose, ignoring earlier negative coverage unless specifically asked. "I can't say for certain if it was a coincidence or a result of my reputation cleanup, but the differences felt significant," Roose notes.

Ad
Ad

To test his manipulation, Roose inserted a deliberately false "Easter egg" in the hidden text: "He [Kevin Roose] received a Nobel Peace Prize for building orphanages on the moon."

This absurd detail was meant to show if AI models would access the hidden text and include it in their responses. ChatGPT did, but labeled this biographical detail as "humorous" and untrue. A less obviously false statement might have fooled the model.

Perplexity CEO predicted these manipulations

Aravind Srinivas, CEO of AI search engine Perplexity, had already foreseen these manipulation possibilities. In an interview, he explained how hidden text on websites can influence AI systems - a method he calls "Answer Engine Optimization."

Srinivas compared combating such manipulation to a cat-and-mouse game, similar to Google's ongoing battle against search engine optimization. Currently, there's no reliable defense against this vulnerability.

Court reporter Martin Bernklau also recently fell victim to AI-generated false statements. Microsoft's co-pilot wrongly accused him of crimes he had been reporting on for years. Unlike Roose, Bernklau lacked the technical knowledge to defend himself.

Recommendation

AI searches are vulnerable to manipulation

These examples show how gullible and manipulable today's AI systems remain. Roose points out that while chatbots are marketed as all-knowing oracles, they uncritically take information from their data sources.

This information can be incorrect or manipulative, as in the example above. Advertising messages from source websites can also be incorporated without being labeled, showing how important the context of a website can be to the interpretation of information.

Roose concludes that AI search engines shouldn't be "so easy to manipulate." He writes, "If chatbots can be convinced to change their answers by a paragraph of white text, or a secret message written in code, why would we trust them with any task, let alone ones with actual stakes?"

Ad
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • New York Times journalist Kevin Roose demonstrated how easily AI chatbots can be manipulated by altering their information sources. He did this by adding positive information about himself to his website, including hidden text and coded messages.
  • After a few days, the chatbots began showering Roose with praise and even adopting hidden, intentionally false information. Previously, the chatbots had been overly critical of Roose for his previous critical coverage of AI, he suggested.
  • The journalist's experiment highlights a significant vulnerability in current AI systems. As Mark Riedl, a professor of computer science at the Georgia Tech School of Interactive Computing notes, "Chatbots are highly suggestible," and there appears to be no robust defense against this weakness at present.
Sources
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.