Chain of Draft Prompts lets LLMs think cheaper with fewer words

A new method called "Chain of Draft" (CoD) helps AI models complete complex tasks using significantly fewer words and greater speed, while maintaining accuracy levels comparable to existing approaches.

CoD generates concise yet informative intermediate results, solving tasks with up to 92.4% fewer words compared to the established Chain of Thought (CoT) method—without any loss in accuracy. The inspiration for CoD comes from human behavior: rather than detailing every thought, people often jot down only essential points in brief bullet form. CoD mimics this strategy.

Q: Jason had 20 lollipops. He gave Denny somelollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny? A: 8 — Example of input and output for a calculation task. | Image: Xu et al.

Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny? A: Let’s think through this step by step: 1. Initially, Jason had 20 lollipops. 2. After giving some to Denny, Jason now has 12 lollipops. 3. To find out how many lollipops Jason gave to Denny, we need to calculate the difference between the initial number of lollipops and the remaining number. 4. We can set up a simple subtraction problem: Initial number of lollipops - Remaining number of lollipops = Lollipops given to Denny 5. Putting in the numbers: 20 - 12 = Lollipops given to Denny 6. Solving the subtraction: 20 - 12 = 8 Therefore, Jason gave 8 lollipops to Denny. #### 8 lollipops — Detailed reasoning steps using the Chain of Thought method. | Image: Xu et al.

Q: Jason had 20 lollipops. He gave Denny somelollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny? A: 20 - x = 12; x = 20 - 12 = 8. #### 8 — Significantly shorter reasoning steps using Chain of Draft. | Image: Xu et al.

While the test prompts remain identical across all three examples, the difference lies in the system prompt. For CoD, researchers modified a chain-of-thought (CoT) prompt to limit each step to a maximum of five words.

Instructions given to the AI model for Chain of Draft prompting. | Image: Xu et al.

Short prompts deliver similar accuracy with fewer resources

The researchers compared CoD to detailed CoT prompts and standard prompts lacking explanatory steps. In arithmetic, comprehension, and symbolic reasoning tasks, CoD achieved similar accuracy to detailed CoT, but used 68 to 86 percent fewer words.

For example, when solving comprehension tasks involving dates, CoD increased accuracy compared to standard prompts from 72.6 to 88.1 percent for GPT-4o and from 84.3 to 89.7 percent for Claude 3.5 Sonnet.

Vergleichstabelle: Performance-Metriken für GPT-4o und Claude 3.5 mit verschiedenen Prompting-Methoden (Standard, CoT, CoD), inkl. Accuracy, Token, Latenz. — Chain of Draft achieves comparable performance while using significantly fewer tokens. | Image: Xu et al.

CoD reduces computational costs and response times

Chain of Draft directly reduces the number of output tokens by generating shorter intermediate reasoning steps. Additionally, it indirectly lowers input token counts, especially in few-shot prompting scenarios, where multiple solved examples are included as part of the initial input prompt.

When these few-shot examples are created using the concise CoD format, each example becomes shorter, resulting in fewer tokens overall. This combined reduction in input and output tokens lowers computational costs, enables faster responses, and makes CoD particularly valuable for large-scale LLM implementations and cost-sensitive applications.

However, compact prompts are not suitable for every task. Some scenarios require extended consideration, self-correction, or external knowledge retrieval. To address these limitations, researchers propose combining CoD with complementary approaches such as adaptive parallel reasoning or multi-level validation. Additionally, these findings could inform future AI model training by incorporating compact reasoning processes into training datasets.

The Chain of Draft method comes from Zoom Communications' research team, which has offered an "AI Companion" for meeting assistance since 2023. While response latency has often been overlooked in AI applications, CoD could prove especially valuable for real-time situations like video calls.

Join our community

Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Recommendation

AI research

Chain of Draft Prompts lets LLMs think cheaper with fewer words

Short prompts deliver similar accuracy with fewer resources

CoD reduces computational costs and response times

Researchers put OpenAI's o1 through its paces, exposing both breakthroughs and limitations

Shopify CEO and ex-OpenAI researcher agree that context engineering beats prompt engineering

AI chatbots become dramatically less reliable in longer conversations, new study finds

Claude 4 can apparently follow a 60,000-character system prompt

OpenAI launches GPT-5 as a unified system with adaptive reasoning for complex tasks

Google Deepmind's Genie 3 creates interactive 3D worlds that stay consistent for "multiple minutes"

Google upgrades Gemini with Deep Think and flags early warning risks

Chain of Draft Prompts lets LLMs think cheaper with fewer words

Short prompts deliver similar accuracy with fewer resources

CoD reduces computational costs and response times

Share

Bank details