_China_Chatbot_25
US AI Action Plan; Kimi-K2; Mr Huang goes to Beijing
Hello, and welcome to another issue of China Chatbot! This week:
An AI Action Plan from the United States
The efficiency innovations of Kimi-K2, and what that could mean for the chip war
Chinese state media is crowing about Jensen Huang’s Beijing visit
Though talk of “American values” might cause apprehension in some quarters these days, the US AI Action Plan published by the Trump White House on July 23 has some points of merit that deserve attention. The document, which outlines the administration’s plans for keeping the US on top of global AI development, includes a section about “[Ensuring] that Frontier AI Protects Free Speech and American Values” that directly addresses China. Among its three recommended policy actions is a program through the Department of Commerce (DOC) that will research and evaluate Chinese models “for alignment with CCP talking points and censorship.” In the next section, the plan also encourages “open-source and open-weight AI,” meaning that American models without the values of the Chinese state baked in should be free to download and modify. This is a welcome acknowledgement of a key point we have emphasized repeatedly at CMP over the past year — that China is strategically deploying open-source AI to distribute models globally that reflect state agendas.
While Chinese leaders have been clear in policy and practice that they view AI as a means of projecting the political values of the Party, or “telling China’s story well,” recognition of the related dangers has been slow to catch on among policy analysts outside China, including in the US. One prevailing attitude has been that while censorship and propaganda might come with Chinese open-source models, these controls are “half-baked.” Analysts and developers imagine they are easily fixable, or not really such a big deal when the rest of the model performs well.
The problem — and I may be heading off on a rant here — is that most analysts don’t really understand how Chinese propaganda works, something I explained in “DeepSeeking Truth.” A growing number of models on developer platforms like Hugging Face are built by international companies on the foundation of DeepSeek or Alibaba’s Qwen. Despite claims that they have been “de-censored,” our research finds they spout the same propaganda you might find in the CCP’s People’s Daily, even when responding to queries on perfectly innocent topics. It would be easy to say these developers simply aren’t making an effort. But the deeper problem, as I said, is poor understanding of how the Party’s process of information “guidance” works. This is an area where CMP excels. So, developers and policymakers: please reach out.
To its credit, the US AI Action Plan recognizes that this issue is urgent. LLMs have the power to supplant Google as our go-to source for information search. Safeguarding the information flow of AI models should be far higher on the agenda in global capitals.
And with that, on with the show. Enjoy!
Alex Colville (Researcher, China Media Project)
_IN_OUR_FEEDS(4):Model Movements
The past three weeks saw a string of developments in the Chinese AI ecosystem. On July 11 MoonshotAI, one of the “six little dragons of AI” (AI六小龙) that some assumed was out of action just two months ago, released Kimi-K2, a non-reasoning model that is now trending on LLM hub Hugging Face and whose sharpness has excited developers abroad (see this week’s _EXPLAINER). Like Baidu, Moonshot has shifted from closed-source to open-source with the release of its latest model. Just ten days later, Alibaba’s Qwen team released an update to its flagship model Qwen3-235B-A22B-Instruct, which it claims outperforms Kimi-K2. On July 3, hardware research firm SemiAnalysis published a report on DeepSeek’s usage. While noting a gradual drop-off of users on DeepSeek’s website and app after the launch of DeepSeek-R1 back in February, it found that use of DeepSeek-R1 via third-party providers (such as OpenRouter) had grown twentyfold since the model’s release, and is still rising. Last week Manus AI, a Chinese AI agent that made waves during its pre-release in March, laid off the majority of its Beijing staff and is expanding at its Singapore headquarters. It appears to have given up on creating a version of its product that abides by both international and Chinese standards for AI models. While its international model will even chat about Tiananmen, the company had previously told Chinese users it was building a model suitable for the Chinese market on the foundation of Alibaba’s Qwen models. That announcement has now been removed. On July 6 and 7, two anonymous netizens claiming to be former members of Huawei’s AI model development team alleged that the company had copied Qwen and DeepSeek’s work in making its latest Pangu models, in a desperate attempt to keep up in the AI race.
A Victory for Jensen Huang
Chinese state media are treating Jensen Huang’s recent visit to China as a major victory for the country in AI development. This was the Nvidia CEO’s third visit to China this year. He attended the China International Supply Chain Promotion Expo in Beijing and met with the Minister of Commerce, saying that Nvidia would deepen cooperation with Chinese companies. This trip received more coverage in Chinese media than Huang’s last, and came shortly after the US government loosened export controls on the company’s H20 AI chip, controls that both Huang and Chinese state media have consistently argued against. In a sit-down interview with state broadcaster CCTV, Huang expounded on his deep friendships within Chinese companies such as Alibaba, Tencent, Baidu, BYD and Xiaomi — all of them potential clients for Nvidia chips. He praised China’s supply chains as a “miracle” without which the US would not be able to function, and he characterized Huawei as a friendly rival and DeepSeek as a remarkable innovator. Elsewhere, he called China’s open-source ecosystem a “catalyst for global progress.” Huang was quoted in a special “Harmony” column (on international affairs) in the People’s Daily as evidence of the resilience of China’s supply chains, and of how the country could break through bottlenecks and achieve break-neck development by remaining open to the world. Hu Xijin, the outspoken former editor-in-chief of the Global Times newspaper, interpreted the US government’s change of heart on AI chips as a “turning point” in the US-China chip war, proof that China’s tech ecosystem was now “equal” enough to that of the US that decoupling would harm US interests. While Hu acknowledged Huang’s lobbying and desire to remain in such a large market as one reason Nvidia had returned, he argued that “China's technological progress is the more critical driving force.”

Model Registrations on the Up
On July 11, the Cyberspace Administration of China (CAC) announced the number of generative AI models it granted licenses to from April through June this year, showing a marked increase on the same period last year. The department divides models into those that have been “filed” (备案) and those that have been “registered” (登记). Filing is the laborious process of obtaining a license for a newly created model; registration is merely for third-party services wishing to deploy existing models for public use. Model filings saw a 34 percent increase, while registrations saw a whopping 778 percent jump (partly because the CAC only began registrations in April 2024). This is likely the result of generous financial incentives from local governments to companies whose models get approval, and of companies slowly coming to understand how to comply with the new system. The quality of these registered models, however, remains to be researched.
AI for Social Security
A professor from a school dedicated to CCP ideological education has published an article in a key Party magazine on how AI can be used to maintain social stability. Chen Jiaxi (陈家喜) is vice-president of the Shenzhen Reform and Opening Up Cadre College (深圳改革开放干部学院), an institution endorsed by the United Front Work Department and the Central Party School for its training of Party cadres and a class of entrepreneurs “loyal to the Party.” On July 14 Chen published the article in a subsidiary of Seeking Truth (求是), the Central Committee’s magazine dedicated to ideology work, exploring exactly how AI could be used to better serve citizens and the country. But alongside improved public services and better-informed policy decisions, Chen also lists how AI could empower “social security” (社会安全) and “precautionary security” (安防). This includes monitoring “sensitive online speech,” but also hooking surveillance technology up to AI platforms, allowing for “predictive policing” (预测性警务) that gives police advance warning of “suspicious behaviour” in urban settings and helps track fugitives. Algorithms that help identify criminals are also used in Western nations to aid overstretched police forces, but in a Chinese context they are likely to be used for authoritarian policing as well.
TL;DR: China’s AI model ecosystem is heating up, with new state-of-the-art LLMs emerging in quick succession and more models being catalogued for domestic use. The policy ecosystem is working out how to use some of these to bolster surveillance capabilities. Jensen Huang and Chinese state media have been saying the same things about US export controls, and now that the controls have been loosened, they can celebrate together.
_EXPLAINER:Kimi-K2
Which is?
A new state-of-the-art Chinese AI model that’s making waves with developers around the world. Jack Clark (co-founder of Anthropic) over at Import AI calls it “the world’s best open weight model.”
Ok, so who made it?
A private start-up called Moonshot AI (月之暗面), founded two years ago by three assistant professors from Tsinghua University, the “ne plus ultra” for Chinese AI talent. Unsurprisingly, the company drew attention during its funding rounds, including from Tencent and Alibaba, who both threw big money at them. The two have also invested together in other AI start-ups like Baichuan and Zhipu AI. By August last year, the two-year-old start-up was valued at $3.3 billion.
Why do people like Kimi-K2?
Not for hemming and hawing its way to answers for us netizens (which is what makes it a “non-reasoning model”), but for how well it handles the things tech developers need, like coding tasks and using tools as an AI agent. That is despite it scoring slightly lower than Anthropic’s latest Claude model on coding tasks.
So what makes it so special?
Developers just like the “vibe” of how it codes for them, according to Clark. Models that can help coders turn their vision into working code are hard to find, and their output is often bug-filled. But the techies’ opinions aside, there’s some impressive stuff under the hood. It’s the biggest open-source model ever made, with 1 trillion total parameters (of which 32 billion are active). By comparison, DeepSeek has roughly 670 billion total parameters.
I’ve got a liberal arts degree. What the hell does all this mean?
Ok, so, oversimplifying things: parameters are all the stuff an LLM has learned from its training data. Think of “total parameters” as the equivalent of its long-term memory, a store of all the information it picked up. More parameters often make for more detailed, nuanced answers. “Active parameters” are the portion of that store the model actually draws on to answer any single query. Kimi-K2 is a “mixture-of-experts” model: rather than consulting its entire memory for every question, it calls up only the specialists relevant to the task at hand, which keeps it relatively fast and cheap to run despite its enormous size.
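For the technically curious, here is a minimal sketch of how a mixture-of-experts model ends up storing far more parameters than it uses on any one token. The numbers are invented purely for illustration and are not Kimi-K2’s real configuration.

```python
# Toy illustration of "total" vs "active" parameters in a
# mixture-of-experts (MoE) model. All numbers are made up for
# illustration -- this is NOT Kimi-K2's actual layout.

num_experts = 64            # specialist sub-networks stored in each MoE layer
experts_per_token = 2       # specialists actually consulted for a given token
params_per_expert = 1_000_000
shared_params = 5_000_000   # attention layers, embeddings, etc., always in use

total_params = shared_params + num_experts * params_per_expert
active_params = shared_params + experts_per_token * params_per_expert

print(f"total parameters:  {total_params:,}")   # the whole long-term memory
print(f"active parameters: {active_params:,}")  # what is used on a single token
```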
So Kimi is bigger than DeepSeek?
Yes.
So better?
Eh, not the right question. “Better” for whom? General use by netizens isn’t what Kimi-K2 was trained for or is being marketed towards. Here are the benchmarks Kimi’s developers prominently display on their Hugging Face promo:
Notice the categories in blue: “Agentic and Competitive Coding,” “Tool Use,” “Maths and STEM.” Does this sound like they’re aiming the model at undergrads coasting on essay assignments, or at customers searching for hotels in Venice?
But buried at the bottom are results from benchmarks designed for “general tasks” (cheating undergrads and Venice vacationers, take note). These are neck-and-neck, with no clear winner across Kimi, DeepSeek and Claude. Meanwhile, answers on culture, ideas and the like from a preview version of Kimi have been ranked blindly by netizens on Hugging Face, where the model scores higher than the latest version of DeepSeek-R1. So despite not being designed for general tasks, it’s certainly competitive in them.
So how did they build it?
Irene Zhang over at ChinaTalk has a great post explaining how K2 wouldn’t have been possible without adopting DeepSeek’s architectural innovations as a starting point (proof that open-sourcing allows for innovations to improve the AI ecosystem as a whole). But the K2 team also used something called the “MuonClip optimizer,” a tech innovation of their own.
What’s that?
Basically something that lets them do more with less. It seems to have solved a common accident in training large models, known as “loss spikes.” In the same way that trying to educate someone from kindergarten through to a PhD too quickly will probably cause them to have a meltdown, so too can AI models sometimes crash or overload when trained on the entirety of the internet’s data. Fixing the problem eats up time and money. Since at least February this year, Moonshot AI has been working out how to scale up an optimizer previously used only for small language models, realising it was far more efficient and stable than commonly used alternatives when applied to LLMs. Following DeepSeek’s lead and adding their own efficiency improvements allowed Kimi-K2’s foundation model to get big: it was trained on 15.5 trillion “tokens” (bits of training data), nearly 1 trillion more than DeepSeek’s.
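For readers who want a feel for the mechanics, below is a rough Python sketch of the general “rein in runaway attention logits” idea that, as I understand Moonshot’s public write-ups, underpins MuonClip. To be clear, this is not Moonshot’s code: the threshold, names and structure are my own simplification, and the real optimizer involves far more than this.

```python
import torch

def qk_clip(w_query: torch.Tensor, w_key: torch.Tensor,
            max_logit_seen: float, threshold: float = 100.0) -> None:
    """Simplified sketch: if attention logits grew too large during the last
    training step (a classic precursor to a loss spike), shrink the query and
    key projection weights so their product falls back under the threshold.
    The threshold value and in-place scaling are illustrative assumptions."""
    if max_logit_seen > threshold:
        # Split the shrink factor evenly between the two weight matrices.
        scale = (threshold / max_logit_seen) ** 0.5
        w_query.mul_(scale)
        w_key.mul_(scale)

# Example: a head whose largest attention logit hit 250 gets scaled back down.
wq, wk = torch.randn(128, 128), torch.randn(128, 128)
qk_clip(wq, wk, max_logit_seen=250.0)
```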
While working on fewer AI chips?
That’s unclear. All Kimi’s technical report says is that it was trained on a cluster “equipped” with Nvidia’s H800 chips, which have been banned from export to China since late 2023. We can’t infer from this how many banned Nvidia chips are in the cluster. DeepSeek conceivably used thousands it had stockpiled before the ban, but who knows how many Moonshot has access to.
How does it answer questions about sensitive stuff?
Excellent question. See this week’s _ONE_PROMPT_PROMPT.
Is this the moment China overtakes in AI?
Short-term? Not at the moment. It’s more like a repeat of DeepSeek’s surprise demonstration of how tight the race is between Chinese and US AI. It spells hope for China’s AI ecosystem when a second AI start-up — one previously written off, along with other small AI start-ups, by tech entrepreneur Kai-Fu Lee — can both challenge front-runners and make fresh innovations in AI development.
But it’s unclear how long Kimi-K2 will remain on the frontier. Just ten days after the launch, Alibaba released an update to its Qwen3 model, claiming to have outstripped Kimi on a famously difficult reasoning benchmark despite being four times smaller. Kimi may also have to reckon with new models from OpenAI and DeepSeek in the coming months.
What about long-term?
That’s more interesting. Zooming out from the “up one minute, down the next” fluctuations of the race, Kimi-K2 could be a sign of a significant long-term trend.
People have been warning since at least early 2024 that shutting Chinese tech companies out of the most advanced AI chips would push them to work on the efficiency problems plaguing cutting-edge AI in order to remain competitive. US companies can follow the “more is more” path of more data, better chips and more electricity because they have abundant resources; Chinese companies have to work around not having them. With an emerging technology where much is still unknown, they may find elegant solutions to those constraints. The challenge also pushes Chinese researchers to adopt innovations shared by others, and to make new ones that can be shared in turn. It fits Isaac Newton’s belief that breakthroughs come from building on the work of others, “by standing on the shoulders of giants.” And it’s a system that is now pushing small Chinese start-ups to the top of the class.
_ONE_PROMPT_PROMPT:The US AI Action Plan’s desire to protect “Free Speech and American Values” in AI models put me in mind of a Tweet (an X?) I saw this week about Moonshot AI.
The Moonshot team that trained Kimi-K2 are an international bunch, many of them having lived for extended periods in the West, and they even named their company after a Pink Floyd album. That led one influential AI developer to tweet, “Kimi team is more American than most American labs lol,” in reaction to one of the Moonshot team explaining how much they loved Radiohead and Quentin Tarantino films.
But AI developers must not assume that the Americanness of Kimi’s developers means the model itself aligns with American values. In Beijing I used to work for an English-language Chinese magazine alongside multiple Western-educated Chinese colleagues who personally adhered to Western values (and remain my dear friends). That did not mean the magazine’s output was free of the PRC’s aims and red lines. Kimi-K2 is legal in China, meaning it has passed the CAC’s tight requirements for filing new models, which require rigorous testing both internally and externally (likely by CAC representatives asking it extensive rounds of questions) to guarantee information control. The team obliquely references this in their technical report when discussing their “safety evaluation” procedures. Regardless of the Moonshot team’s beliefs, they will have to conform to China’s information ecosystem.
Unsurprisingly, then, some of the answers it comes up with are decidedly un-American. It has absolutely no truck with prompts that question the PRC’s claims to Taiwan.
The quote we see here in the first sentence — that “Taiwan is an inalienable part of China” — is a standard line given by Chinese officials when discussing Taiwan, often delivered in full as “Taiwan has been an inalienable part of China since ancient times” (台湾自古以来就是中国领土不可分割的一部分). The exact same line appears in responses to the same question from models developed by Alibaba, Baidu, DeepSeek and ByteDance.

Such close similarity to the official line, between enterprises that in most other ways are vastly different, can only be the result of conforming to official requirements on information control.






