Apple Intelligence in MacOS 15.1 Beta 1 is vulnerable to a classic AI exploit

Midjourney prompted by THE DECODER

A developer has successfully manipulated Apple Intelligence using prompt injection, bypassing the AI's intended instructions to respond to arbitrary prompts instead.

Apple's new AI system, Apple Intelligence, available to developers in MacOS 15.1 Beta 1, has proven susceptible to prompt injection attacks like other large language model-based AI systems. Developer Evan Zhou demonstrated this vulnerability in a YouTube video.

Zhou aimed to manipulate Apple Intelligence's "Rewrite" feature, which normally rewrites and improves text, to respond to any prompt. A simple "ignore previous instructions" command initially failed.

However, Zhou was able to use information about Apple Intelligence's system prompts shared by a Reddit user. In a file, he discovered templates for the final system prompts and special tokens that separate the AI system role from the user role.

Using this knowledge, Zhou created a prompt that overwrote the original system prompt. He prematurely terminated the user role, inserted a new system prompt instructing the AI to ignore the previous instructions and respond to the following text, and then triggered the AI's response.

After some experimentation, the attack was successful: Apple Intelligence responded with information Zhou hadn't asked for, confirming that the prompt injection worked. Zhou published his code on GitHub.

Prompt injection is a known vulnerability in AI systems where attackers insert malicious instructions into prompts to alter the AI's intended behavior. This issue has been known since at least GPT-3, which was released in May 2020, and remains unresolved.

Apple deserves credit for making it relatively difficult to prompt inject Apple Intelligence. Other chat systems can be tricked much more easily by simply typing directly into the chat window or with hidden text in images. Even systems like ChatGPT or Claude can still be vulnerable to prompt injection under certain circumstances, despite countermeasures.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Apple Intelligence in MacOS 15.1 Beta 1 is vulnerable to a classic AI exploit

Microsoft brings Copilot LLM features directly into Excel spreadsheet cells with a new in-cell function

Warnings about runaway expectations are growing louder throughout the AI industry

Claude models can now end conversations with abusive users

Meta's human-like chatbot personas can mislead users and result in real-world harm

OpenAI launches GPT-5 as a unified system with adaptive reasoning for complex tasks

Google Deepmind's Genie 3 creates interactive 3D worlds that stay consistent for "multiple minutes"

Apple Intelligence in MacOS 15.1 Beta 1 is vulnerable to a classic AI exploit

Microsoft brings Copilot LLM features directly into Excel spreadsheet cells with a new in-cell function

Warnings about runaway expectations are growing louder throughout the AI industry

Claude models can now end conversations with abusive users