Anthropic has developed a method to improve the accuracy of retrieval from knowledge bases. The approach, called contextual retrieval, aims to produce more accurate answers by attaching additional context to each indexed chunk.
Contextual retrieval addresses a key limitation of existing retrieval augmented generation (RAG) systems. When documents are split into smaller chunks for indexing, important contextual information is often lost.
Anthropic's solution prepends a short snippet to each chunk before indexing that situates the chunk within the full document. These context snippets are typically up to 100 words long.
Here's an example:
original_chunk = "The company's revenue grew by 3% over the previous quarter."
contextualized_chunk = "This chunk is from an SEC filing on ACME corp's performance in Q2 2023; the previous quarter's revenue was $314 million. The company's revenue grew by 3% over the previous quarter."
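The contextualization step can be sketched in a few lines. The helper function and the prompt template below are illustrative assumptions, not Anthropic's exact implementation; in practice the context snippet would be generated by an LLM given the full document and the chunk.

```python
def contextualize_chunk(chunk: str, context: str) -> str:
    """Prepend the situating context to a chunk before it is embedded and indexed."""
    return f"{context} {chunk}"

# Illustrative prompt template for generating the context snippet with an LLM;
# the wording is an assumption, not Anthropic's exact prompt.
CONTEXT_PROMPT = (
    "<document>\n{document}\n</document>\n"
    "Here is the chunk we want to situate within the whole document:\n"
    "<chunk>\n{chunk}\n</chunk>\n"
    "Give a short, succinct context (under 100 words) that situates this "
    "chunk within the overall document, to improve search retrieval."
)
```

The key design point is that the context is chunk-specific: each chunk gets its own snippet, so document-level facts (the company, the quarter, the prior figures) travel with the chunk into the index.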
According to Anthropic, the method can cut retrieval failure rates by up to 49 percent. When the retrieved results are additionally reranked, reductions of up to 67 percent are possible.
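The retrieve-then-rerank pipeline those numbers refer to can be sketched as follows. The two scoring functions here are toy word-overlap stand-ins, not Anthropic's actual components; a real system would use contextual embeddings and/or BM25 for the first stage and a dedicated reranking model for the second.

```python
def retrieve_then_rerank(query: str, chunks: list[str], k: int = 20, n: int = 5) -> list[str]:
    """Two-stage pipeline: a cheap first-stage retrieval picks the top-k
    candidates, then a second-stage reranker keeps the best n of those."""
    query_terms = set(query.lower().split())

    def recall_score(chunk: str) -> int:
        # Stage 1: raw term overlap (stand-in for embeddings/BM25).
        return len(query_terms & set(chunk.lower().split()))

    def rerank_score(chunk: str) -> float:
        # Stage 2: overlap normalized by chunk length (stand-in for a
        # slower but more precise reranking model).
        words = chunk.lower().split()
        return len(query_terms & set(words)) / (len(words) or 1)

    candidates = sorted(chunks, key=recall_score, reverse=True)[:k]
    return sorted(candidates, key=rerank_score, reverse=True)[:n]
```

The split matters because the reranker is typically too slow to score every chunk in the index; it only sees the k candidates the cheap first stage surfaces.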
Anthropic notes that contextual retrieval can be integrated into existing RAG systems with minimal effort. The company has published a detailed implementation guide with code samples on GitHub.
Research backs context-aware approach
Recent work from Cornell University supports the effectiveness of context-aware retrieval. In a paper, researchers proposed a similar technique called "Contextual Document Embeddings" (CDE). They developed two complementary methods for contextualized embeddings:
- Contextual training: This reorganizes training data so each batch contains similar but hard-to-distinguish documents, forcing the model to learn more nuanced differences.
- Contextual architecture: A two-stage encoder integrates information from neighboring documents directly into embeddings, allowing the model to account for relative term frequencies and other contextual cues.
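The first of these ideas, grouping hard-to-distinguish documents into the same training batch, can be sketched as follows. The Jaccard token-overlap similarity is a toy stand-in for the learned similarity the paper uses; the greedy grouping is likewise an illustrative simplification.

```python
def jaccard(a: set, b: set) -> float:
    """Token-overlap similarity between two token sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def contextual_batches(docs: list[str], batch_size: int) -> list[list[str]]:
    """Greedily group documents so each batch contains similar,
    hard-to-distinguish documents, per the contextual-training idea."""
    token_sets = [set(d.lower().split()) for d in docs]
    remaining = list(range(len(docs)))
    batches = []
    while remaining:
        seed = remaining.pop(0)
        batch = [seed]
        # Fill the batch with the documents most similar to the seed.
        remaining.sort(key=lambda i: jaccard(token_sets[seed], token_sets[i]),
                       reverse=True)
        while remaining and len(batch) < batch_size:
            batch.append(remaining.pop(0))
        batches.append([docs[i] for i in batch])
    return batches
```

Packing near-duplicates into one batch makes the contrastive objective harder: the model can no longer separate documents by topic alone and must learn finer-grained distinctions.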
The researchers found both methods yield improvements independently, but work best in combination. They've released their CDE model and a tutorial on Hugging Face.
In tests on the Massive Text Embedding Benchmark (MTEB), the CDE model achieved top scores for its size class. Experiments showed CDE offers particular advantages for smaller, domain-specific datasets in areas like finance or medicine. Improvements were also seen in tasks like classification, clustering and semantic similarity.
However, the researchers note it's unclear how CDE might impact massive knowledge bases with billions of documents. More investigation is also needed into optimal context size and selection.