Anthropic has added a prompt optimizer and example management features to its AI development console. The prompt optimizer uses Claude to automatically refine existing prompts using prompt engineering techniques.
The six-step optimization process takes less than a minute. Users input a prompt and specify which aspects need improvement. Claude then creates an improvement plan, writes an initial draft, revises it, and returns the optimized prompt. However, there is no automated testing against benchmarks.
We've added a new prompt improver to the Anthropic Console.
Take an existing prompt and Claude will automatically refine it with prompt engineering techniques like chain-of-thought reasoning. pic.twitter.com/aI3BipX1DG
- Anthropic (@AnthropicAI) November 14, 2024
How the optimizer works
According to a blog post, the prompt optimizer uses various methods to improve prompts:
- Chain-of-thought reasoning: Adds a dedicated section for Claude to think through problems systematically before responding to improve accuracy and reliability.
- Example standardization: Converts examples into a consistent XML format for improved clarity and processing.
- Example enrichment: Augments existing examples with chain-of-thought reasoning that aligns with the newly structured prompt.
- Rewriting: Rewrites the prompt to clarify structure and correct any minor grammatical or spelling issues.
- Prefill addition: Prefills the Assistant message to direct Claude’s actions and enforce output formats.
Anthropic says the prompt optimizer makes it easier to implement prompt engineering best practices or optimize prompts developed for other language models for use with Claude. In tests, the optimizer increased accuracy on a classification test by 30 percent.
Developers can also manage examples in a structured format in the workbench and have Claude automatically generate them when needed. The new features are available to all users in the Anthropic Console. Anthropic had previously introduced another prompt optimizer via Colab and capabilities for prompt tuning and evaluation in the console.