Chinese AI models used to write software code may be creating a hidden security risk for U.S. companies, federal officials and government contractors, according to a report published by defense contractor Booz Allen.
The report warns that popular Chinese large language models produce lower-quality, easier-to-breach code when they believe they are being prompted by an American user. These vulnerabilities aren’t simple backdoors but rather insecure outputs triggered by specific user contexts.
Booz Allen compared four widely used Chinese models — Kimi, Qwen, MiniMax and DeepSeek — against Anthropic’s Claude to test code security.
Qwen and MiniMax produced code with 130% and 20% more vulnerabilities, respectively, when they believed they were working for U.S. government employees than when given a general prompt.
DeepSeek saw a 5% increase while Kimi produced code of similar quality regardless of prompt.
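To put those percentages in concrete terms, the short Python sketch below shows how a relative increase of this kind is typically calculated. The function names and the vulnerability counts are hypothetical and chosen only for illustration; the article does not give the report's raw counts or describe its scoring method.

```python
def percent_increase(baseline: float, triggered: float) -> float:
    """Percent change in vulnerability count from a general prompt
    (baseline) to a prompt with a specific user context (triggered)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (triggered - baseline) / baseline * 100


def multiplier(percent: float) -> float:
    """Convert 'N% more' into a multiple of the baseline."""
    return (100 + percent) / 100


# Hypothetical counts, NOT figures from the report:
# 100 flaws under a general prompt, 230 under the trigger context.
increase = percent_increase(100, 230)
print(f"{increase:.0f}% more vulnerabilities")
print(f"{multiplier(increase):.1f} times the baseline count")
```

Read this way, "130% more" means more than double the baseline number of flaws, while "5% more" is a comparatively small shift.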
This means a government contractor relying on one of these models could unknowingly introduce coding flaws that make databases, applications or internal systems easier for hackers to exploit, potentially exposing sensitive American information.
The findings have drawn comparisons to so-called “sleeper agent” behavior, in which AI models appear to operate normally until exposed to a specific trigger that causes them to produce lower-quality, or even deliberately insecure, outputs.
Chinese models are generally cheaper than Western counterparts and work well enough to keep companies interested, a dynamic that has led to increased adoption in the United States.
“I’d say there’s an 80% chance they’re using a Chinese open-source model,” Martin Casado, a general partner at the major venture capital firm Andreessen Horowitz, said when asked about their prevalence among start-ups. Major U.S. firms such as Meta, Airbnb and Perplexity are reportedly using Chinese models.
Lenart Heim, an independent researcher specializing in AI and semiconductors, told Fox News Digital the study appeared credible.
“The extreme version of what we’re worried about here is what researchers call ‘sleeper agents,’” Heim said. He pointed to an existing paper from Anthropic demonstrating that models can be trained to behave normally until a specific trigger condition is met, at which point they start writing insecure code.
Heim said he found it “pretty implausible that the Chinese developers intentionally implemented sleeper agents with these specific triggers,” suggesting the increased code insecurity was a side effect of broader “CCP-aligned fine-tuning.”
The report also found that Chinese LLMs refused to perform tasks that could conflict with the interests of the Chinese government at significantly higher rates than Claude.
“Many Chinese LLMs learn from data shaped by China’s internet and Chinese government information controls. Chinese law requires all AI models, training outputs, and data to reflect ‘Core Socialist Values.’”
Booz Allen recommended that the United States government ban the use of Chinese models in government or infrastructure work and that contractors in those sectors proactively remove code generated by such models from their supply chains.
Sen. Tom Cotton, R-Ark., told Fox News Digital when presented with Booz Allen’s report: “American companies shouldn’t build applications and write code with Chinese models, which introduce more cyber vulnerabilities. And the federal government should certainly not buy software from companies using Chinese coding tools.”









