Content
summary Summary

Peter Gostev, Head of AI at Moonpig, found an easy way to get a Chinese language model (LLM) to talk about taboo topics like the Tiananmen Square incident.

Gostev manipulated DeepSeek's public chatbot by mixing languages and swapping out certain words. He would reply in Russian, then translate his message back into English, tricking the AI into talking about the events in Tiananmen Square. Without this method, the chatbot would simply delete all messages on sensitive topics, Gostev said.

Video: Peter Gostev via LinkedIn

Gostev's example illustrates China's dilemma of wanting to be a world leader in AI, but at the same time wanting to exert strong control over the content generated by AI models (see below).

Ad
Ad

Controlling the uncontrollable

But if the development of language models has shown one thing, it is that they cannot be reliably controlled. This is due to the random nature of these models and their massive size, which makes them complex and difficult to understand.

Even the Western industry leader OpenAI sometimes exhibits undesirable behavior in its language models, despite numerous safeguards.

In most cases, simple language commands, known as "prompt injection," are sufficient - no programming knowledge is required. These security issues have been known since at least GPT-3, but until now, no AI company has been able to get a handle on them.

Simply put, the Chinese government will eventually realize that even AI models it has already approved can generate content that contradicts its ideas.

How will it deal with this? It is difficult to imagine that the government will simply accept such mistakes. But if it doesn't want to slow AI progress in China, it can't punish every politically inconvenient output with a model ban.

Recommendation

China's regulatory efforts for large AI models

The safest option would be to ban all critical topics from the datasets used to train the models. The government has already released a politically approved dataset for training large language models, compiled with the Chinese government in mind.

However, the dataset is far too small to train a capable large language model on its own. Political censorship would therefore limit the technical possibilities, at least at the current state of the technology.

If scaling laws continue to apply to large AI models, the limitation of data material for AI training would likely be a competitive disadvantage.

At the end of December, China released four large generative AI models from Alibaba, Baidu, Tencent, and 360 Group that had passed China's official "Large Model Standard Compliance Assessment."

Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Ad
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

China first released guidelines for generative AI services last summer. A key rule is that companies offering AI systems to the public must undergo a security review process, in which the government checks for political statements and whether they are in line with the "core values of socialism."

Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Bank transfer
Summary
  • Peter Gostev, head of AI at Moonpig, manipulated a Chinese chatbot to discuss taboo topics like the Tiananmen incident. All he had to do was mix the languages and change certain words.
  • This example illustrates China's dilemma: it wants to be a global leader in AI, but it also insists on tight control over the content generated by AI models.
  • Despite regulatory efforts and politically coordinated datasets for training large language models, the Chinese government will inevitably be confronted with unwanted content and will need to find a way to deal with it without slowing down AI progress in the country.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.