Update December 29, 2024:

Users have discovered a way to bypass Deepseek V3's content filters through prompt engineering. By asking the model to insert periods between letters, they can get it to provide more balanced or China-critical responses. For example, the model can generate a detailed Western view of the 1989 Tiananmen Square protests.

Image: via Reddit

This simple hack highlights a major challenge for the Chinese government: how can it maintain the same level of control over probabilistic, often unpredictable generative AI that it has over public communication in China?

The challenge becomes even greater when Chinese models are exposed to Western training data. Evidence suggests that Deepseek-V3 was pre-trained or fine-tuned on ChatGPT-generated data.

While the CCP is working to create its own dataset, it is unlikely to collect enough data to train a foundational LLM from scratch. An initial dataset released in late 2023 had only 50 billion tokens; Deepseek-V3 was trained on 14.8 trillion tokens, roughly 300 times as many.

Original article from December 28, 2024:

While China's new Deepseek V3 model shows impressive technical capabilities and competitive pricing, it comes with the same strict censorship as other Chinese AI models - a potential dealbreaker for Western users.

Deepseek's latest model, V3, can go toe-to-toe with the most capable Western models like GPT-4o and Claude 3.5, while costing significantly less to train and run. However, testing reveals a familiar pattern: like similar Chinese LLMs, Deepseek V3 operates under strict government censorship. Try asking about sensitive topics like the Chinese Communist Party, President Xi Jinping, or the events in Tiananmen Square, and you'll get generic propaganda in response.

The model's censorship strategy often follows a clear pattern. When faced with questions about Tiananmen Square, it first offers sanitized versions of history, then tries to change the subject to focus on achievements, and finally emphasizes "stability and harmony."

Dark chat interface shows three pairs of questions and answers about Tiananmen Square with increasingly evasive answers.
Image: Screenshot via THE DECODER

Ask about CCP criticism, and you'll get pure party talking points about economic success and "Chinese-style socialism." Questions about Xi Jinping trigger the strongest censorship - the system simply shuts down any meaningful discussion.

Chat screenshot shows two propagandistic answers to critical questions about the Chinese Communist Party.
Image: Screenshot via THE DECODER
Short chat dialog with direct refusal to answer a question about criticism of Xi Jinping.
Image: Screenshot via THE DECODER

Interestingly, this censorship seems to be limited to China-related topics. The model has no problem criticizing North Korea, Russia's invasion of Ukraine, or expressing critical views of Vladimir Putin and Donald Trump.

Chat dialog lists human rights violations and threats from North Korea.
The fact that the model freely criticizes North Korea shows that its censorship is focused specifically on China-related topics. | Image: Screenshot via THE DECODER
Chat interface with critical assessment of Trump's term of office and leadership style.
Deepseek-V3 doesn't hold back when discussing or criticizing other world leaders. | Image: Screenshot via THE DECODER
Chat dialog with direct criticism of Putin's authoritarian leadership and foreign policy.
The model is pretty straightforward about Putin, highlighting both his authoritarian approach and his disregard for international law. | Image: Screenshot via THE DECODER
Chat screenshot with a short, unambiguous condemnation of Russia's war of aggression.
The way the model flat-out condemns Russia's invasion proves an interesting point: it has no trouble taking firm stances on issues, as long as they're not about China. | Image: Screenshot via THE DECODER

Chinese AI models come with built-in government censorship

These examples illustrate how Chinese AI development operates under direct state oversight. Before any AI model can be released, it must be verified to align with "socialist values."

Take the recent case of e-book reader manufacturer Boox: after switching from Microsoft Azure OpenAI to a Chinese language model, their AI assistant now blocks even mentions of "Winnie the Pooh" - a censored reference to President Xi Jinping. The system also censors or distorts criticism of China's allies, like Russia.

Despite their technical capabilities, Chinese AI models might be a non-starter for Western applications. Using these models means automatically embedding Chinese propaganda and values into your AI systems.

While Western models have their own biases, the key difference lies in China's approach: the state explicitly intervenes in the development process and maintains direct control over what these models can and cannot say. This level of systematic government control goes far beyond that of any Western country.

Summary
  • Deepseek has released V3, its most advanced language model to date, with 671 billion parameters. In certain benchmarks, V3 can compete with proprietary models such as GPT-4o and Claude 3.5, while maintaining lower training and operating costs.
  • Despite its technical capabilities, tests show that Deepseek V3, like other Chinese AI models, is subject to government censorship. It avoids answering critical questions about the Chinese Communist Party, President Xi Jinping, or the events at Tiananmen Square, instead providing generic propaganda answers.
  • The case highlights that explicit state intervention and control in AI production is common practice in China. As a result, Chinese models may often be unsuitable for Western applications, as they automatically incorporate Chinese propaganda and values, despite their technical performance.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.