AI models in China must first and foremost accurately reflect the values of the Party. In line with this, a new large language model (LLM) is being developed based directly on Xi Jinping's philosophy.
According to the South China Morning Post, the China Cyberspace Research Institute, which reports to the national regulator Cyberspace Administration of China, has created its own language model.
The model is based on President Xi Jinping's philosophy and other selected cyberspace topics in line with the government's official stance. It handles typical LLM tasks such as answering questions, writing reports, and translating between Chinese and English.
Unlike other systems, the model uses a curated knowledge base with locally generated data and is not open source, which the administration says ensures security and reliability. It was featured on the Cyberspace Administration of China magazine's WeChat account.
"The professionalism and authority of the corpus ensure the professional quality of the generated content," the institute states.
The system is still being tested internally at the China Cyberspace Research Institute and is not yet available to the public.
China is trying to make the unpredictable predictable
Chinese companies developing their own LLMs must comply with government regulations and ensure that the content generated by the models conforms to the government's socialist values.
The Communist Party's Central Committee has also issued a directive requiring mandatory learning activities for party members "to better acquaint them with Xi Jinping Thought," according to the SCMP.
Like all LLM makers and providers, China's developers face the challenge of balancing the usefulness of language models against the leeway the models need to be useful. Even leading AI models with strong safety guardrails can still be made to generate unwanted or potentially harmful results through relatively simple hacks and prompt injections.
The best way to avoid unwanted results seems to be to limit the training material. China is trying this approach by developing an LLM dataset that contains only content consistent with the Party's values.
However, such restrictions on training data could compromise the performance of a promising new technology and set China back in the AI race with the United States. That is the dilemma of a restrictive policy.