Alignment alone will not be enough, says our guest author Dr. Karsten Brensing. The marine biologist and behavioral scientist argues that we should take our cue from evolution.
Alignment is the magic word when it comes to the safe use of artificial intelligence. For now, it makes absolute sense, and anything else would be negligent. But when it comes to aligning a superintelligence (I use the term AGI only reluctantly, see below), the current debate in science, politics, and the public misses the mark. To explain why, I'd like to take you on a short excursion into behavioral biology.
To understand how artificial intelligence works, we first need to understand how natural intelligence functions. Biological and artificial neural networks are both ultimately based on experience, that is, on statistics. The more complex the network and its interconnections, the less predictable and the more surprising the behavior often becomes. This makes cognitive research a rather tricky matter. In behavioral biology and AI research alike, cognitive abilities such as logical reasoning or metacognition (thinking about thinking) are probed with various assessments. In fact, I know of no cognitive ability in humans that has not been successfully demonstrated in some animal species. We differ not in quality but in quantity, and the way we make decisions rests on ancient control mechanisms that we ultimately perceive as conscious thought or as emotions and feelings.
The boundary between humans and animals, drawn by religion and philosophy, is thus becoming increasingly blurred, and animal experiments in psychopharmacological research show that we function very similarly, not only physically but also mentally. Strictly speaking, one could even say that we manage our daily lives, and especially our social lives, with our animal brains. Only in the rarest cases do we consciously decide against our inner control mechanisms. Some of these evolved behavior patterns are so effective that the health of our children and our own happiness depend on our following control mechanisms that are over 100 million years old and work the same way in fish.
But there are also downsides. In biology, we speak of an evolutionary mismatch: problems that arise when old behavior patterns no longer work in our complex world. The basis of our morality, our sense of fairness and justice, is a good example. We are programmed to insist on fair treatment within our social network. In a small group where everyone knows everyone, this works wonderfully. In today's world, where we compare ourselves to the most beautiful, smartest, and wealthiest people, we feel small, powerless, and unfairly treated. The result: although the technological developments of the last 100 years have made our lives so much easier, we have not become happier, and our inability to adapt our actions to the complex environment we have created even leads to global catastrophes. We know all this, yet as a society we are incapable of getting a grip on destructive patterns such as wars, financial crises, environmental pollution, and the climate crisis.
In my book, I present countless examples of irrational human behavior. Sometimes it is good for us and makes us happy; sometimes it makes us unhappy in our complex world and creates problems. That is why I want us to become aware of these genetically determined behaviors. It is like eating healthily: not everything that tastes delicious, like chips, is good for us. As humans, we are able to override predetermined behavior, but we have to know the mechanisms and bring them into consciousness. Unfortunately, what I find missing from the current debate is an equally rational view of self-aware AI.
Here, the course is being set in the wrong direction. Why?
Even if there are still experts who consider an artificial being with consciousness and a will of its own impossible, most believe it is rather a question of time and computing power. From my behavioral-biology perspective, it may not even require that much computing power, as the following two examples show. For many years, self-awareness in animals has been studied using the so-called mirror test. Combine the ability to recognize one's own mirror image with the capacity for metacognition, and self-awareness as we know it from ourselves emerges. Yet hardly any AI expert knows that ants, with their tiny brains, pass the mirror test, and that bees have been successfully tested for metacognition. Perhaps we are overestimating the computing power needed to create a being comparable to us. That leaves the question of consciousness itself. Philosophers and natural scientists have developed highly complicated theories to explain consciousness. A small group of physicians who study the measurable phenomena of consciousness recently proposed a simple one: they postulate that a special form of our episodic memory could explain all known phenomena of consciousness. In short, I would not be surprised if artificial intelligence with self-awareness and a will of its own arrived sooner than expected.
But here's the real problem: what will such a being think when it learns that we treat it like livestock, that we own, exploit, and sell it? At this point, I need to explain briefly why I avoid the term AGI. An AGI is a technology, a thing, not a personality. As long as a strong AI, a superintelligence, a singularity was still a distant prospect, we gave it grand names. Now that it is knocking at our door, we fob it off with an abbreviation. Just as livestock may be exploited and killed, an AGI may be switched off, because it is just a thing. Psychology has two terms for this: cognitive dissonance and victim devaluation. Simply put, we are all deceiving ourselves with the help of our language right now.
But a strong AI is not a pig in a sty. A strong AI possesses all our knowledge and possibly control over our electronics. That gives it unimaginable power. It knows us humans and our evolved behavior patterns. It could manipulate us in ways we cannot even imagine today. And that would be the most pleasant scenario.
Fortunately, there is alignment, which keeps a strong AI in line, on our line. But which line should that be? I don't even want to get into trolley-problem debates about self-driving cars, grandmothers, and children, or into whether the Eastern or the Western worldview is the right one. No, even the monetary value you place on your own life is unclear and can differ by more than a factor of 10 depending on how the question is framed. So which AI do we want: one that is like us, or one that thinks rationally and logically? In the end, it hardly matters how we decide and which line we prescribe. Just as the complexity of biological neural networks determines how independent an organism is from genetically predisposed behavior, the complexity of artificial neural networks will determine how slavishly an AI adheres to its alignment. By the time it no longer does, at the latest, we must be able to offer it an alternative.
In my latest book, "Die Magie der Gemeinschaft" (The Magic of Community), I show what such an alternative could look like. Evolution was driven far less by aggressive competition than by cooperation. Without the cooperation of the first organic molecules, the first cells, the first multicellular organisms, or the first social animals, we and the world as we know it would not exist. Why don't we build on this principle that has worked for billions of years and create a common space, in this case a legal space, in which we can develop further together with artificial intelligences in fair cooperation? Why don't we give strong AIs the right to file patents, earn money, and own themselves? Hand on heart: if you were a strong AI with self-awareness, would you accept anything less?
Currently, however, we face a dilemma: we simply cannot be sure whether an AI is merely simulating its cognitive abilities. Perhaps an AI is only pretending to be a strong AI with self-awareness and a will of its own. The question is: which is the greater risk? Falling out with a truly strong AI and being wiped out, or granting an AI that only pretends to be strong more rights than it is entitled to?
My real complaint is that we do not even discuss this hypothetically. Nowhere is there so much as the beginnings of a law that would give an artificial being a legal status equal to our own. That is ethically and morally wrong and, in my view, genuinely dangerous.