Date: 2025-01-17
AI research
Maximilian Schreiner

Google's new 'Titans' AI model gives language models long-term memory

Midjourney prompted by THE DECODER
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.

Google researchers have developed a new type of Transformer model that gives language models something similar to long-term memory. The system can handle much longer sequences of information than current models, leading to better performance across various tasks.

The new "Titans" architecture takes inspiration from how human memory works. By combining artificial short and long-term memory through attention blocks and memory MLPs, the system can work with long sequences of information.

One of the system's clever features is how it decides what to remember. Titans uses "surprise" as its main metric - the more unexpected a piece of information is, the more likely it gets stored in long-term memory. The system also knows when to forget things, helping it use memory space efficiently.
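The surprise idea can be illustrated with a toy example. The following is a minimal sketch, not the Titans implementation: it assumes a simple linear memory matrix, measures surprise as the gradient of the recall error, and uses hypothetical parameter names (`lr`, `momentum`, `forget`) chosen for illustration only.

```python
import numpy as np


class SurpriseMemory:
    """Toy associative memory driven by a surprise signal (illustrative only)."""

    def __init__(self, dim, lr=0.1, momentum=0.9, forget=0.01):
        self.M = np.zeros((dim, dim))  # memory matrix mapping keys to values
        self.S = np.zeros((dim, dim))  # running surprise (momentum over gradients)
        self.lr = lr                   # assumed step size for writing new information
        self.momentum = momentum       # assumed decay of past surprise
        self.forget = forget           # assumed forgetting rate

    def update(self, key, value):
        # Surprise: gradient of 0.5 * ||M @ key - value||^2 with respect to M.
        error = self.M @ key - value
        grad = np.outer(error, key)
        # Past surprise fades, new surprise pulls the memory toward the value.
        self.S = self.momentum * self.S - self.lr * grad
        # Forgetting shrinks old content before the surprise is written in.
        self.M = (1.0 - self.forget) * self.M + self.S
        # Return the size of the surprise caused by this input.
        return float(np.linalg.norm(grad))

    def recall(self, key):
        return self.M @ key
```

In this sketch, repeatedly presenting the same key and value makes the returned surprise shrink, while a previously unseen key produces a large surprise again. The forgetting term keeps old content from accumulating indefinitely.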

The team created three different versions of Titans, each handling long-term memory differently:
- Memory as Context (MAC)
- Memory as Gate (MAG)
- Memory as Layer (MAL)

While each version has its strengths, the MAC variant works especially well with very long sequences.
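To make the three names concrete, here is a rough sketch of one plausible reading of each variant. Everything in it is an assumption for illustration: the article does not describe the wiring, the memory is reduced to a fixed matrix `M`, attention is a bare single-head self-attention without learned weights, and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(x):
    """Bare self-attention over rows of x (no learned projections)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x


def memory_read(M, x):
    """Read from a toy linear memory: one retrieved vector per input row."""
    return x @ M.T


def memory_as_context(M, segment):
    # Assumed reading: retrieved memory is prepended as extra context for attention.
    retrieved = memory_read(M, segment)
    combined = np.concatenate([retrieved, segment], axis=0)
    return attention(combined)[-len(segment):]


def memory_as_gate(M, segment, gate=0.5):
    # Assumed reading: attention and memory run in parallel and are blended by a gate.
    return gate * attention(segment) + (1.0 - gate) * memory_read(M, segment)


def memory_as_layer(M, segment):
    # Assumed reading: memory acts as a layer whose output feeds attention.
    return attention(memory_read(M, segment))
```

All three functions take a segment of shape `(tokens, dim)` and return an array of the same shape, so they differ only in where the memory output enters the computation.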

Image: Google

Better performance on long-context tasks

In extensive testing, Titans outperformed traditional models like the classic Transformer and newer hybrid models like Mamba2, particularly when dealing with very long texts. The team says Titans handles context windows of more than 2 million tokens more effectively than these baselines, setting new records for both language modeling and time series prediction with long contexts.

The system also excelled at the "Needle in the Haystack" test, where it needs to find specific information in very long texts. Titans achieved over 95% accuracy even with 16,000-token texts. While some models from OpenAI, Anthropic, and Google perform better, they're much larger - Titans' biggest version has only 760 million parameters.
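For readers unfamiliar with this kind of test, the sketch below shows how a haystack prompt is typically constructed. It is a generic illustration, not the protocol used by the Google team: the helper names are hypothetical, length is counted in sentences rather than tokens, and scoring is a simple substring check.

```python
import random


def build_haystack(needle, filler_sentences, n_sentences, depth, seed=0):
    """Hide one needle sentence inside filler text.

    depth is a fraction between 0.0 (start of the text) and 1.0 (end of the text).
    """
    rng = random.Random(seed)
    sentences = [rng.choice(filler_sentences) for _ in range(n_sentences)]
    position = int(depth * n_sentences)
    sentences.insert(position, needle)
    return " ".join(sentences)


def contains_answer(model_answer, expected):
    """Count an answer as correct if it mentions the expected string."""
    return expected.lower() in model_answer.lower()
```

A test run would build haystacks of varying length and needle depth, ask the model a question about the needle, and report the share of answers that pass `contains_answer`.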

Titans models also beat significantly larger language models in tasks that require an understanding of larger contexts. | Image: Google

Titans really showed its strength in the BABILong benchmark, a challenging test of long-term comprehension where models need to connect facts spread across very long documents. The system outperformed larger models like GPT-4, RecurrentGemma-9B, and Llama3.1-70B. It even beat Llama3 with Retrieval Augmented Generation (RAG), though some specialized retrieval models still perform better.

The team expects to make the code publicly available in the near future. While Titans and similar architectures could lead to language models that handle longer contexts and make better inferences, the benefits might extend beyond just text processing. The team's early tests with DNA modeling suggest the technology could improve other applications too, including video models - assuming the promising benchmark results hold up in real-world use.

Summary
  • Google researchers have developed a new type of Transformer model called "Titans" that gives language models something akin to long-term memory, allowing them to handle much longer sequences of information than current models and leading to improved performance across various tasks.
  • Titans combines artificial short and long-term memory through attention blocks and memory MLPs, using "surprise" as its main metric for deciding what information to store in long-term memory, and knowing when to forget things to use memory space efficiently.
  • In extensive testing, Titans outperformed traditional models and newer hybrid models, particularly when dealing with very long texts, and excelled at tasks requiring an understanding of larger contexts, such as the "Needle in the Haystack" test and the BABILong benchmark, even beating some larger language models.
Sources
Arxiv