A simple request to "write better code" helped Claude 3.5 Sonnet produce code that runs roughly 100 times faster than its first attempt, while adding unexpected (and unwanted) features typically found in enterprise software.
BuzzFeed Senior Data Scientist Max Woolf recently ran an experiment: what happens when you repeatedly tell Claude 3.5 to "write better code"? The results were surprising - not just in terms of performance gains, but in what the LLM decided "better" meant.
Woolf started with a classic programming challenge: write Python code that finds the difference between the largest and smallest numbers whose digits sum to 30, in a list of one million random integers between 1 and 100,000. After getting Claude's first solution, he simply kept prompting it to "write better code."
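In code, the baseline task looks roughly like the straightforward version below. The names and structure are illustrative, not Claude's first attempt verbatim:

```python
import random

def digit_sum(n: int) -> int:
    # Sum the decimal digits of n.
    return sum(int(d) for d in str(n))

# One million random integers between 1 and 100,000.
numbers = [random.randint(1, 100_000) for _ in range(1_000_000)]

# Keep only the numbers whose digits add up to 30,
# then report the spread between the largest and smallest.
matching = [n for n in numbers if digit_sum(n) == 30]
print(max(matching) - min(matching))
```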
The results were dramatic. The original code took 657 milliseconds to run. By the final iteration, it was down to just 6 milliseconds - a 100x speedup. But raw performance wasn't the only surprise.
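Woolf's write-up details the specific changes Claude made in each iteration; as a rough illustration of the kind of rewrite that produces gains on this order, a vectorized NumPy version avoids Python-level loops over the million numbers. This sketch is an assumption for illustration, not Claude's actual output:

```python
import numpy as np

def digit_sums(values: np.ndarray) -> np.ndarray:
    # Compute digit sums for a whole array at once,
    # without converting each number to a string.
    sums = np.zeros_like(values)
    v = values.copy()
    while v.any():
        sums += v % 10
        v //= 10
    return sums

numbers = np.random.randint(1, 100_001, size=1_000_000)
matching = numbers[digit_sums(numbers) == 30]
print(int(matching.max() - matching.min()))
```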
Enterprise features appear out of nowhere
In its fourth and final "write better code" iteration, Claude spontaneously transformed the code into what looked like an enterprise application, adding typical corporate features without being asked. This suggests the LLM has somehow linked the concept of "better code" with "enterprise-grade software" - an interesting window into its training.
Developer Simon Willison offers an explanation for why this iterative improvement works: language models like Claude start fresh with each new prompt - they don't build up knowledge or memory over time like humans do. When asked to "write better code," Claude gets the entire previous conversation as context but analyzes it like unfamiliar code it's seeing for the first time.
This explains the continuous improvements - the LLM approaches the code with fresh eyes each time, unconstrained by previous attempts or preconceptions about what the "right" solution should be.
Prompt engineering isn't going anywhere
When Woolf tried again with more specific prompts, he got better results faster, but ran into subtle bugs that needed human fixes. He emphasizes that careful prompt engineering is worth the effort, and argues that as AI models get more sophisticated, precise guidance becomes more important, not less.
"Although it’s both counterintuitive and unfun, a small amount of guidance asking the LLM specifically what you want, and even giving a few examples of what you want, will objectively improve the output of LLMs more than the effort needed to construct said prompts", Woolf writes.
Interestingly, in both experiments, Claude skipped optimizations that seemed obvious to human developers, like removing duplicates or sorting numbers first. And even minor changes in how you phrase requests can dramatically affect Claude's output.
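For instance, since every value falls between 1 and 100,000, a human might precompute the digit sums once and deduplicate the input before checking anything. The sketch below illustrates that idea; it is not code from the experiment:

```python
# Precompute digit sums once for every possible value (1..100,000),
# so the per-number check becomes a single table lookup.
LOOKUP = [sum(int(d) for d in str(n)) for n in range(100_001)]

def spread_of_digit_sum_30(numbers: list[int]) -> int:
    # Deduplicate first: a million draws from only 100,000 possible
    # values contain many repeats, so most checks are redundant.
    matching = [n for n in set(numbers) if LOOKUP[n] == 30]
    return max(matching) - min(matching)
```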
Woolf is careful not to overstate AI's capabilities. While the performance gains are impressive, human developers are still essential for verifying solutions and troubleshooting. Despite what some AI enthusiasts may claim, we're not yet in the era of perfect, automatically generated code.
"Even if LLMs can be wrong, one notable thing I learnt from these experiments is that they do have interesting ideas and tool suggestions even if the code output can’t be used as-is," Woolf writes.