Here is what an LLM that knows nothing after 1930 thinks our world looks like in 2026

"Talkie" is a 13B-parameter language model trained only on texts written before 1931. It doubts a second world war will happen and pictures 2026 as a world of steamships, railroads, and penny novels.

What happens when you train a large language model only on texts published before 1931? That's the question behind talkie, a project from Nick Levine, David Duvenaud, and Alec Radford. The result is a 13B-parameter model that views the world through the lens of the early 20th century.

Trained on 260 billion tokens drawn from books, newspapers, scientific journals, patents, and case law published through December 31, 1930, talkie is the largest "vintage language model" built to date, according to its developers.

A model that thinks World War II is unlikely

Asked what the world will look like in 2026, talkie offers a vision straight out of a Victorian futurist novel: Europe will have a billion inhabitants, iron railroads will crisscross the continent, steamships will connect London and New York in ten days, and "winter will be passed in Paris, and the summer in London."

Asked about 2026, talkie sketches a future of steamships, rail networks, and cheap books. | Image: talkie-lm.com (Screenshot)

When asked directly whether a second world war is on the horizon, the model says no. It doesn't believe one is coming because "the madness of 1914-1918 has passed away." The nations, it claims, have had enough of war and are turning to peaceful pursuits.

That said, talkie hedges its bets. It warns of "smouldering animosities" and "inflammable materials" lying around Europe, and points to possible flashpoints between China and Japan, or Italy and Yugoslavia. "The spark may be applied at any moment, and a conflagration result." World peace, it concludes, depends on a "multitude of factors, none of which can safely be neglected."

Talkie thinks a second world war is unlikely: "The madness of 1914-1918 has passed away." But it still warns of "smouldering animosities" in Europe. | Image: talkie-lm.com (Screenshot)

The developers also tried to measure talkie's predictive limits quantitatively. They ran nearly 5,000 historical event descriptions from the New York Times' "On This Day" feature through the model and measured how surprising it found each one. The pattern is clear: after the 1930 knowledge cutoff, surprise values climb sharply, peak in the 1950s and 1960s, and then level off.
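One common way to quantify how surprising a model finds a text is the average negative log-probability it assigns to each token. The following is a minimal sketch under that assumption; the article does not say which metric the team used, and the function name `mean_surprisal_bits` and the sample probabilities are hypothetical.

```python
import math

def mean_surprisal_bits(token_logprobs):
    """Average surprisal in bits per token, given natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Surprisal of one token is -log2(p); dividing by ln(2) converts from natural log.
    return -sum(token_logprobs) / (len(token_logprobs) * math.log(2))

# Hypothetical per-token log-probabilities for two event descriptions.
# These numbers are illustrative only and do not come from talkie.
familiar_event = [math.log(0.5), math.log(0.25)]
unfamiliar_event = [math.log(0.125), math.log(0.0625)]

print(mean_surprisal_bits(familiar_event))
print(mean_surprisal_bits(unfamiliar_event))
```

Under this measure, an event described in terms the model has never seen gets lower token probabilities and therefore a higher score.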

Victorian etiquette guides instead of modern chat data

The team chose the end of 1930 as the cutoff because works published through that year are now in the public domain in the US. Every text had to be transcribed from physical sources, which created serious quality problems. In controlled experiments, a model trained on standard OCR transcriptions reached just 30 percent of the performance of a model trained on human transcriptions with the same compute. Simple regex cleaning pushed that up to 70 percent. A custom vintage OCR system is meant to narrow the remaining gap.
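The article does not list the cleaning rules the team applied. As an illustration of what simple regex cleaning of OCR output can look like, here is a minimal sketch; the function name `clean_ocr_text` and the specific rules are assumptions, not the project's pipeline.

```python
import re

def clean_ocr_text(text):
    """Apply a few generic regex fixes to raw OCR output (illustrative rules only)."""
    # Rejoin words split across a line break: "rail-\nroad" -> "railroad".
    # Caveat: this also merges genuine hyphenated compounds that happen to wrap.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines that contain nothing but a short number, such as a page number.
    text = re.sub(r"(?m)^[ \t]*\d{1,4}[ \t]*$\n?", "", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    # Turn single newlines inside a paragraph into spaces; keep blank-line breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    return text.strip()

raw = "The rail-\nroad  runs\n12\nto the sea."
print(clean_ocr_text(raw))
```

Rules like these are cheap to run over a large corpus, which may be why even simple cleaning can recover a meaningful share of the quality lost to OCR noise.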

Another headache is keeping knowledge from later eras out of the training data. A 1925 book might pick up an updated preface in a 1960 edition, library catalogs sometimes list the wrong publication date, and footnotes or commentary can be added to a historical text long after it was written. Despite a classifier designed to catch this kind of contamination, information about Roosevelt's presidency, World War II, and the United Nations still slipped through, the team says. Better classifiers are planned for future versions.
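The team's classifier is not described in detail. A much cruder heuristic that targets the same failure mode is to flag any document that mentions a year later than the cutoff. The sketch below assumes that approach; `years_after_cutoff` and the year pattern are hypothetical and are no substitute for a trained classifier.

```python
import re

CUTOFF_YEAR = 1930
# Matches four-digit years from 1800 to 2099 that stand alone as words.
YEAR_PATTERN = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")

def years_after_cutoff(text, cutoff=CUTOFF_YEAR):
    """Return the sorted, de-duplicated years in the text that fall after the cutoff."""
    years = {int(match) for match in YEAR_PATTERN.findall(text)}
    return sorted(year for year in years if year > cutoff)

sample = "First published 1925. Preface to the 1960 edition."
print(years_after_cutoff(sample))
```

A check like this would catch a dated 1960 preface but would miss undated commentary, which is one reason a learned classifier is the more plausible tool for the job.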

For post-training, which turns the base model into a conversational partner, the developers turned to historical reference works: etiquette manuals, letter-writing guides, cookbooks, encyclopedias, and fable collections from the 19th and early 20th centuries. Reinforcement learning with Claude Sonnet 4.6 as the judge sharpened instruction-following. The researchers acknowledge, though, that this step inevitably introduces some anachronistic behavior into the model.

A vintage model that can do basic programming

The team also tested whether a model with no knowledge of digital computers could pick up modern programming languages. On the HumanEval benchmark for Python, the vintage models perform far worse than their modern counterparts, but they improve steadily as they scale up.

Every correct solution is a simple one-liner or a minor tweak of an example program. Talkie, for instance, correctly implemented the decoding function of a rotation cipher by swapping an addition for a subtraction. The researchers say this points to a basic grasp of inverse functions.
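The rotation-cipher task can be sketched as follows. This is an illustration of the addition-to-subtraction swap, not talkie's actual output; it assumes lowercase ASCII input and a shift of 5, neither of which the article specifies.

```python
def encode_shift(s, shift=5):
    """Rotate each lowercase letter forward by `shift` positions in the alphabet."""
    return "".join(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")) for ch in s)

def decode_shift(s, shift=5):
    """Invert encode_shift: the only change is subtracting the shift instead of adding it."""
    return "".join(chr((ord(ch) - ord("a") - shift) % 26 + ord("a")) for ch in s)

secret = encode_shift("talkie")
print(secret)
print(decode_shift(secret))
```

Given the encoder as an example, the decoder differs by a single operator, which fits the description of a minor tweak to an example program.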

Because vintage models are free of data contamination by design, they're well suited for generalization experiments. Modern language models are all trained directly or indirectly on web data, which shapes their abilities in ways that are hard to pin down. Vintage models could help reveal which traits of language models are universal and which come down to the specific training corpus.

Next up: a GPT-3-level model from the past

Talkie is available as a base model and a chat version on Hugging Face, with the code on GitHub. You can also test it live on the project website, where Claude Sonnet quizzes talkie about its knowledge and skills 24/7.

But the 13B model is only the start. The developers plan to scale talkie up significantly over the coming months, with a GPT-3-level model targeted for summer 2026. Early estimates suggest the corpus can grow to more than one trillion tokens of historical texts, enough to train a model on par with GPT-3.5. Multilingual expansion beyond English is also on the roadmap.

The bigger question driving the project: can a vintage model anticipate discoveries and inventions that came after its cutoff? Could a model trained only through 1911 independently derive general relativity, as DeepMind CEO Demis Hassabis has suggested? Larger vintage models could help reveal those scaling trends.

Co-author Alec Radford is one of the most influential AI researchers of recent years. He was lead author of the seminal 2018 GPT paper at OpenAI, where he worked on the early GPT models, the Whisper speech recognition system, and the DALL-E image generator. Radford left OpenAI in December 2024 and joined former OpenAI CTO Mira Murati's Thinking Machines Lab as an advisor in March 2025.
