Date: 2025-04-13

The Decoder: Mr. King, let’s start with a simple question: How did Meta become interested in neuroscience in the first place? At first glance, it seems like an unusual path – from a social network to neuroscience research.

Jean Rémi King: Yes, I work at Meta within FAIR, the Fundamental AI Research lab. It was launched by Yann LeCun a little over ten years ago. The idea back then was to establish a lab dedicated to fundamental AI research. Even at that time, the broader industry, and Mark Zuckerberg in particular, recognized how impactful AI would be for the tech sector. So, it was critical for the company to remain at the cutting edge of knowledge in this area.

FAIR has grown quite a bit since then. Initially, most researchers were working in computer vision and natural language processing. At some point, there was a decision to ensure a more diverse portfolio of researchers, so that not everyone was thinking in the same way. A few physicists were hired, and I was brought on as a neuroscientist—likely to broaden that portfolio.

This didn’t come out of nowhere. Neuroscience and AI have been intertwined from the beginning. That’s why we talk about artificial neural networks. The idea of hierarchical layers in algorithms actually originates in systems neuroscience, and the two fields have shared many links over the years. I believe Yann and Joelle Pineau saw the importance of continuing to push in that direction, and that’s probably why I was hired.

That said, I always feel a bit awkward answering this question—no one ever told me this directly. I was just hired and then given the freedom to continue the research I had already been working on.

The Decoder: Has your work always been situated at the intersection of AI and neuroscience?

Jean Rémi King: I did my undergraduate degree in AI and cognitive science more than 20 years ago now, which feels a bit daunting to admit. Even then, I was positioned at the intersection of those two fields. As a teenager—and even as a kid—I was fascinated by robotics and the idea of building intelligent systems. Of course, back then, it was something of an AI winter.

After my undergraduate studies, I started to think that neuroscience might be a bit more mature as a field, so I shifted away from computer science. I pursued my master’s and PhD more heavily on the neuroscience side, using machine learning algorithms mainly as tools to analyze complex data—rather than as a means to build intelligent systems. At the time, it felt more like statistics on steroids than a scientific goal in itself.

But around 2011–2012, things began to accelerate in what we now call deep learning. That’s when I returned to the frontier between neuroscience and AI, this time with the goal of exploring whether there are general principles that shape our own reasoning—principles that could also apply to algorithms.

The Decoder: Has your research with AI changed your conceptual understanding of the brain?

Jean Rémi King: I think studying the brain is one of the ways you’re forced to reconsider what thinking really means. AI today also makes it clear that some of the concepts we take for granted—like reasoning or thinking—may need to be re-evaluated in light of what deep learning algorithms are now capable of.

For those of us working in the field, this is a profound source of curiosity and wonder. How intelligence and reasoning can emerge from something as mechanical as cells interacting, with action potentials firing in the brain, is a deeply compelling question.

So it’s not that AI made me rethink these ideas; rather, I was already deeply intrigued by the notion that thinking, at its core, must be grounded in physics. That’s what drew me to the field in the first place, and I think many of my peers followed a similar path.

The Decoder: Do you have a personal “favorite theory” of how the brain works? In your papers, you often mention predictive coding. Is that a framework you consider particularly promising?

Jean Rémi King: That’s a tricky one, because I think many of us have a love-hate relationship with predictive coding. It’s a framework that was first popularized by Rao and Ballard in the 1990s, and then widely promoted by Karl Friston in the 2000s within systems neuroscience.

Friston is a fascinating figure in science. He has both incredibly original ideas and a tendency to obscure them behind dense, often cryptic mathematics. Sometimes, when reading his equations, it takes a moment to realize they're actually familiar concepts—just expressed in very unusual formalisms. And in a way, that’s reflective of the theory itself.

There are many compelling ideas in the original formulation of predictive coding. But when it comes to making the theory precise enough to generate specific, testable predictions, it becomes quite difficult. That’s the challenge—translating these broad concepts into empirically grounded models.

That said, many of the general ideas are genuinely interesting. In AI, and in predictive coding, one central idea is that driving a system to minimize its prediction error can be a sufficient principle for intelligence to emerge. The notion is that by learning to predict the world, a system will build useful intermediate representations. This idea is at the heart of the theory.
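As a toy illustration of the principle described above, the following Python sketch trains a two-weight linear predictor by doing nothing except reducing its next-step prediction error on a sine wave. It is not predictive coding as formulated by Rao and Ballard or by Friston, and it is not code from the interview; the function names, learning rate, and signal are assumptions chosen for illustration. The point is only that minimizing prediction error is enough for the weights to settle on the recurrence that generates the signal.

```python
import math

def train_next_step_predictor(signal, order=2, lr=0.05, epochs=200):
    """Learn weights so that a weighted sum of the last `order` values predicts the next one."""
    weights = [0.0] * order
    for _ in range(epochs):
        for t in range(order, len(signal)):
            context = signal[t - order:t]
            prediction = sum(w * x for w, x in zip(weights, context))
            error = prediction - signal[t]  # the prediction error being minimized
            # Gradient step on 0.5 * error**2 with respect to each weight.
            weights = [w - lr * error * x for w, x in zip(weights, context)]
    return weights

def mean_squared_error(signal, weights):
    """Average squared next-step prediction error over the whole signal."""
    order = len(weights)
    errors = [
        (sum(w * x for w, x in zip(weights, signal[t - order:t])) - signal[t]) ** 2
        for t in range(order, len(signal))
    ]
    return sum(errors) / len(errors)

# Hypothetical input: a sine wave, which obeys x[t] = 2*cos(0.3)*x[t-1] - x[t-2].
signal = [math.sin(0.3 * t) for t in range(200)]
untrained_error = mean_squared_error(signal, [0.0, 0.0])
weights = train_next_step_predictor(signal)
trained_error = mean_squared_error(signal, weights)
print(weights, untrained_error, trained_error)
```

In this sketch the learned weights approach the coefficients of the sine wave's own recurrence, even though the training signal never states that rule explicitly.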

But why this process is sufficient, that is, why minimizing prediction error leads to intelligent representations, remains unclear. We see this optimization happening in our algorithms, where we can control and understand it, and presumably something similar is happening in the brain. Yet the fact that this alone might give rise to intelligent processing is still something we don’t fully understand. It may not be a necessary condition, but it increasingly seems to be a sufficient one.

So, to answer your question—I don’t have a favorite theory. Like many of my peers, I’m more interested in exploring these large, sometimes unwieldy theories to see if they contain missing pieces—concepts that could help us better understand how the brain actually works.

The Decoder: In one of your earlier papers, you wrote that word sequences – the order of individual words – quickly become unpredictable, while their meaning may remain more stable. You suggest that for an intelligent system, it might be important not only to predict the next words, but to anticipate more abstract, hierarchical representations over longer timescales. I'm curious: Have you gained any new insights into this in your recent research – especially with regard to other modalities like images or videos, where similar challenges arise?

Jean Rémi King: What became apparent to us after publishing that paper—and I think this still holds true today—is that it’s not enough to simply predict what’s going to happen next, right after an observation. It's equally important to predict what will happen much later, well beyond the immediate moment. That’s a valuable goal, but in practice, it's extremely difficult.
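To make the difficulty of long-range prediction concrete, here is a small Python sketch under simplifying assumptions: a noisy first-order autoregressive series stands in for any raw signal, and the forecaster is given the true coefficient, so it is the best possible predictor at each horizon. The process, its parameters, and the function names are hypothetical and do not come from the paper discussed here. Even with this ideal forecaster, the error grows as the prediction horizon gets longer.

```python
import random

def simulate_ar1(n, a=0.9, noise_std=1.0, seed=0):
    """Generate n samples of x[t] = a * x[t-1] + Gaussian noise, with a fixed seed."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(a * x[-1] + rng.gauss(0.0, noise_std))
    return x

def horizon_mse(x, a, k):
    """Mean squared error of the k-step-ahead forecast a**k * x[t]."""
    errors = [(a ** k * x[t] - x[t + k]) ** 2 for t in range(len(x) - k)]
    return sum(errors) / len(errors)

series = simulate_ar1(5000)
# Compare a short, a medium, and a long prediction horizon.
mse_by_horizon = {k: horizon_mse(series, 0.9, k) for k in (1, 5, 20)}
print(mse_by_horizon)
```

The raw values become steadily harder to forecast further out, which is one way to see why predicting more abstract, slower-changing representations is attractive.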

Even today’s models that can handle multi-token prediction don’t scale particularly well. Building a model that can generate an entire paragraph or page at once is still incredibly challenging. This kind of long-range prediction just isn’t something current systems do easily.

My strong belief is that this is a genuinely hard problem: figuring out what kinds of architectures can support long-range inference in latent space. The classic transformer, as we have it today, remains limited in that regard.

Within our group, we've decided to take a step back from trying to invent those architectures ourselves—largely because so many teams are already working on this problem purely from an AI perspective. It seems unlikely that a breakthrough architecture will come directly from a neuroscience lab. However, we still collaborate with others working on adjacent challenges.

For example, at FAIR we have a team focused on computer vision for video. There, too, the goal isn’t just to predict the next video frame but to anticipate what might happen 10 seconds or even a minute later. That’s a massive challenge from a computer science standpoint.

We also have people working on code generation. In that context, it’s not useful to just predict the next character in a line of code. Ideally, you'd want a model to generate a full structure—say, a set of functions, which call classes, which interact with a dataset. Simply predicting the next token often isn't the best way to reason through that process.

So while we've explored these ideas, I wouldn't say we've solved anything. What we've mostly learned is just how difficult this problem truly is.