
DeepMind expert says trimming documents improves accuracy despite large context windows

How useful are million-token context windows, really? In a recent interview, Nikolay Savinov from DeepMind explained that when a model is fed many tokens, it has to distribute its attention across all of them: focusing more on one part of the context automatically leaves less attention for the rest. To get the best results, Savinov recommends including only the content that is truly relevant to the task.
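The dilution effect Savinov describes follows directly from how softmax attention normalizes scores: the weights must sum to 1, so padding the context with irrelevant tokens shrinks the weight on the relevant ones. A minimal illustration (the scores here are made up for demonstration):

```python
import math

def softmax(scores):
    # Normalize raw attention scores into weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One highly relevant token (score 4.0) among irrelevant ones (score 0.0).
short_ctx = softmax([4.0] + [0.0] * 9)    # 10-token context
long_ctx = softmax([4.0] + [0.0] * 999)   # 1000-token context

print(f"relevant-token weight, short context: {short_ctx[0]:.3f}")
print(f"relevant-token weight, long context:  {long_ctx[0]:.3f}")
```

With the same raw score, the relevant token receives roughly 0.86 of the attention in the short context but only about 0.05 in the long one. Real models mitigate this with many heads and layers, but the normalization constraint remains.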

I'm just talking about-- the current reality is like, if you want to make good use of it right now, then, well, let's be realistic.

Nikolay Savinov

Recent research supports this approach. In practice, this could mean cutting out unnecessary pages from a PDF before sending it to an AI model, even if the system can technically process the entire document at once.
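In code, such pre-filtering can be as simple as dropping pages that don't mention the task's key terms before building the prompt. The sketch below uses a crude keyword filter as a stand-in; a production pipeline would more likely use embeddings or a retriever, and the function and sample data are hypothetical:

```python
def trim_context(pages, keywords):
    """Keep only pages that mention at least one task-relevant keyword.

    `pages` is a list of page texts (e.g. extracted from a PDF);
    `keywords` are terms relevant to the task at hand.
    """
    kws = [k.lower() for k in keywords]
    return [p for p in pages if any(k in p.lower() for k in kws)]

pages = [
    "Quarterly revenue rose 12% year over year.",
    "Legal boilerplate and forward-looking statements.",
    "Revenue by segment: cloud grew fastest.",
]

relevant = trim_context(pages, ["revenue"])
# Only the matching pages enter the model's context window.
prompt = "\n\n".join(relevant)
```

Here two of the three pages survive the filter, so the model's attention is spent only on material that can actually answer the question.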
