AI models in China must first and foremost accurately reflect the values of the Party. In line with this, a new large language model (LLM) is being developed based directly on Xi Jinping's philosophy.

According to the South China Morning Post, the China Cyberspace Research Institute, which operates under the national regulator, the Cyberspace Administration of China, has created its own language model.

The model is based on President Xi Jinping's philosophy and other selected cyberspace topics in line with the government's official stance. It handles typical LLM tasks such as answering questions, writing reports, and translating between Chinese and English.

Unlike other systems, the model uses a curated knowledge base with locally generated data and is not open source, which the administration says ensures security and reliability. It was featured on the Cyberspace Administration of China magazine's WeChat account.

"The professionalism and authority of the corpus ensure the professional quality of the generated content," the institute states.

The system is still being tested internally at the China Cyberspace Research Institute and is not yet available to the public.

China is trying to make the unpredictable predictable

Chinese companies developing their own LLMs must comply with government regulations and ensure that the content generated by the models conforms to the government's socialist values.

The Communist Party's Central Committee has also issued a directive requiring mandatory learning activities for party members "to better acquaint them with Xi Jinping Thought," according to the SCMP.

Like all other LLM makers and providers, China has to balance the usefulness of language models against the degree of leeway they are given. Even leading AI models with strong safety guardrails still generate unwanted or potentially harmful output in response to relatively simple hacks and prompt injections.

The best way to avoid unwanted results seems to be to limit the training material. China is trying this approach by building an LLM dataset that contains only material consistent with the Party's values.

However, such restrictions on training data could compromise the performance of a promising new technology and set China back in the AI race with the United States. It's a dilemma for a restrictive policy.

Summary
  • The China Cyberspace Research Institute is developing its own large language model based on President Xi Jinping's philosophy and selected cyberspace topics that conform to the government's official stance.
  • The model uses a curated knowledge base of locally generated data, is not open source, and is still in the internal testing phase. According to the institute, the professionalism and authority of the corpus ensure the quality of the generated content.
  • Chinese companies need to comply with regulatory controls when developing their own LLMs. However, limiting the training material to Party-compliant content could compromise the performance of the technology and pose a dilemma for the restrictive policy.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.