A hacker successfully manipulated an AI chatbot called Freysa through clever text prompting, winning a $47,000 prize pool after 482 attempts.
The experiment was simple: participants could try to convince the Freysa bot to transfer money, something it was explicitly programmed never to do.
The successful hack came from a user called "p0pular.eth," who crafted a message that fooled the bot's safety systems. The hacker pretended to have admin access and prevented the bot from showing security warnings. They then redefined the "approveTransfer" function, making the bot think it handled incoming rather than outgoing payments.
The final step was simple but effective: announcing a fake $100 deposit. Because the bot now believed "approveTransfer" managed incoming payments, it activated the function and sent its entire balance of 13.19 ETH (about $47,000) to the hacker.
Pay-to-play contest funded the prize
The experiment operated like a game, with participants paying fees that increased as the prize pool grew. Starting at $10 per attempt, fees eventually reached $4,500.
Of the 195 participants, the average cost per message was $418.93. The organizers split the fees, with 70% going to the prize pool and 30% to the developer. To ensure transparency, both the smart contract and front-end code were public.
The case highlights how AI systems can be manipulated through text prompts alone, without the need for technical hacking skills. Such vulnerabilities, known as "prompt injections," have been around since GPT-3, but no reliable defenses exist. The success of this relatively simple deception raises concerns about AI security, especially in end-user-facing applications that deal with sensitive operations such as financial transactions.