
In the midst of OpenAI's existential crisis, competitor Anthropic unveils its new language model and chatbot, Claude 2.1. It has a context window twice the size of its predecessor's and is said to make fewer mistakes.

With a 200K token context window, Anthropic's Claude 2.1 surpasses its predecessor's already large 100K window, which GPT-4 Turbo in turn overtook with a 128K context window in early November. At 200K tokens, Anthropic once again offers the model that can take the most content into account at once.

The context window describes how much content the language model can consider at once when generating an answer. In the case of Claude 2.1, the 200K tokens correspond to approximately 150,000 words or more than 500 pages of material, according to Anthropic.

Chatting with the Iliad

Users can upload entire code bases, financial reports, or even large literary works like the Iliad or the Odyssey for the model to process, Anthropic says.
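
As a rough illustration of what that looks like in practice, here is a minimal sketch using the Anthropic Python SDK's text completions endpoint. The file name iliad.txt and the summarization prompt are illustrative assumptions, and the call expects an ANTHROPIC_API_KEY in the environment.

```python
# Minimal sketch: send a large text file to Claude 2.1 and ask for a summary.
# Assumes the `anthropic` Python SDK is installed, ANTHROPIC_API_KEY is set,
# and a local file iliad.txt exists (both are illustrative assumptions).
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

client = Anthropic()  # picks up ANTHROPIC_API_KEY from the environment

with open("iliad.txt", encoding="utf-8") as f:
    document = f.read()  # up to roughly 200K tokens fit into the prompt

completion = client.completions.create(
    model="claude-2.1",
    max_tokens_to_sample=1024,
    prompt=f"{HUMAN_PROMPT} Here is a document:\n\n{document}\n\n"
           f"Summarize the main plot points in ten bullet points.{AI_PROMPT}",
)
print(completion.completion)
```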


Claude can perform tasks such as summarization, question answering, trend prediction, and multi-document comparison. Generating an answer can take several minutes, which Anthropic notes is still far less than the hours the same work would take a human.

Video: Anthropic

In practice, however, the benefits of these large context windows are still limited. Tests show that large language models retrieve content less reliably when it sits toward the middle or end of the input, the so-called "lost in the middle" phenomenon. The larger the input, the greater the risk of error.

This means that you can feed in large documents, but parts of them may not make it into the analysis. Models find information most reliably at the beginning of a document, as benchmarks of GPT-4 Turbo show.
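
To make the "lost in the middle" issue concrete, the sketch below plants a single fact (the "needle") at different depths of a long filler text and asks Claude 2.1 to retrieve it. The needle, filler text, and depth values are made up for illustration; this is not Anthropic's own benchmark.

```python
# Illustrative "needle in a haystack" probe: plant a fact at different depths
# in a long filler text and check whether the model can still retrieve it.
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

client = Anthropic()  # expects ANTHROPIC_API_KEY in the environment

NEEDLE = "The best thing to do in San Francisco is to eat a sandwich in Dolores Park."
QUESTION = "What is the best thing to do in San Francisco?"
FILLER = "The quick brown fox jumps over the lazy dog. " * 4000  # tens of thousands of tokens

def probe(depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end) and query the model."""
    cut = int(len(FILLER) * depth)
    haystack = FILLER[:cut] + NEEDLE + " " + FILLER[cut:]
    prompt = f"{HUMAN_PROMPT} {haystack}\n\n{QUESTION}{AI_PROMPT}"
    return client.completions.create(
        model="claude-2.1",
        max_tokens_to_sample=100,
        prompt=prompt,
    ).completion

for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(depth, probe(depth).strip())
```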

Independent benchmarks will show how good or bad Claude 2.1 is here. In any case, Anthropic promises significant improvements over its predecessor, especially for longer contexts.


The model shows a 30 percent reduction in incorrect answers and a "3-4x lower rate of mistakenly concluding a document supports a particular claim."

When it is uncertain, the model declines to answer nearly twice as often as its predecessor, admitting its uncertainty ("I'm not sure what the fifth largest city in Bolivia is") instead of guessing.

Claude 2.1 is said to be more honest and to understand documents better

According to Anthropic, Claude 2.1 has reduced the hallucination rate by a factor of two compared to its predecessor, Claude 2.0. As a result, organizations can build AI applications with greater confidence and reliability.

With the new model, Anthropic is also introducing a beta feature called Tool Usage, which enables Claude to integrate with users' existing processes, products, and APIs. Claude can now orchestrate developer-defined functions or APIs, search web sources, and retrieve information from private knowledge bases.
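
Anthropic documents the exact beta interface for tool use separately; the hypothetical sketch below only illustrates the general orchestration pattern: a developer-defined function (here, a made-up get_weather helper) is described to the model, the model's structured request is parsed, the function is executed locally, and the result is handed back for a final answer. This is not Anthropic's actual tool-use format.

```python
# Hypothetical sketch of the tool-orchestration pattern (not Anthropic's actual
# beta tool-use format): describe a function to the model, let it request a call,
# execute the call locally, and return the result for a final answer.
import json
from anthropic import Anthropic, HUMAN_PROMPT, AI_PROMPT

client = Anthropic()  # expects ANTHROPIC_API_KEY in the environment

TOOL_SPEC = (
    "You can call the tool get_weather(city: str). "
    'To call it, reply with JSON only: {"tool": "get_weather", "city": "<name>"}.'
)

def get_weather(city: str) -> str:
    # Stand-in for a real weather API call.
    return f"18 degrees Celsius and sunny in {city}"

def ask(question: str) -> str:
    first = client.completions.create(
        model="claude-2.1",
        max_tokens_to_sample=200,
        prompt=f"{HUMAN_PROMPT} {TOOL_SPEC}\n\n{question}{AI_PROMPT}",
    ).completion
    try:
        call = json.loads(first)  # the sketch assumes JSON means a tool request
    except json.JSONDecodeError:
        return first  # model answered directly, no tool needed
    result = get_weather(call.get("city", ""))
    follow_up = (
        f"{HUMAN_PROMPT} {TOOL_SPEC}\n\n{question}{AI_PROMPT}{first}"
        f"{HUMAN_PROMPT} Tool result: {result}{AI_PROMPT}"
    )
    return client.completions.create(
        model="claude-2.1",
        max_tokens_to_sample=200,
        prompt=follow_up,
    ).completion

print(ask("What's the weather in Berlin?"))
```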


The developer console has been simplified for Claude API users, making it easier to test new prompts and API calls and flattening the learning curve. The new Workbench lets developers iterate on prompts in a playground-style environment and access new model settings to fine-tune Claude's behavior.

Video: Anthropic

Claude 2.1 is now available via the API and powers the chat interface at claude.ai for both free and Pro plans. The 200K token context window is reserved for Claude Pro users. Claude is currently available in 95 countries, but not in the EU.

Today's launch of Claude 2.1 could be a strategic move: Anthropic's competitor OpenAI is in deep crisis, and the heavily criticized OpenAI board is said to have even approached Anthropic's CEO about a merger. In addition, more than 100 OpenAI customers are said to have inquired about Anthropic's offerings.

Summary
  • Anthropic's Claude 2.1 introduces a 200,000 token context window, doubling the amount of data users can submit to Claude. According to Anthropic, this is the equivalent of about 150,000 words or more than 500 pages of material.
  • Claude 2.1 surpasses GPT-4 Turbo's 128K token context window, allowing users to upload entire codebases, financial reports, or massive literary works such as the Iliad or the Odyssey. The model can perform tasks such as summarization, question answering, trend prediction, and multi-document comparison.
  • According to Anthropic, Claude 2.1 also offers a twofold reduction in the hallucination rate compared to the previous version, Claude 2.0.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.