Open weights

Washington built a machine to gate frontier AI. The capability it fears just shipped as a free download.

GLM-5.2's cyber results don't prove a Chinese lab caught the frontier. They prove the thing the U.S. is trying to contain no longer lives only in models it can switch off.

Amy Mercer·Jun 30·7 min

Image: Carl Lender / Flickr (CC BY 2.0)

Last week a model that anyone can download was handed a piece of real, open-source software and a single instruction: find the access-control bugs. It was given no special tooling, no scaffolding, no purpose-built pipeline — a prompt and nothing else. On that one task it found the flaws at a higher rate than a frontier commercial model running inside a system built specifically to hunt them. The downloadable model is Chinese, open-weight, and free to copy. That is the fact worth sitting with, and it is worth being precise about exactly what it does and does not mean.

The model is GLM-5.2, released in mid-June by the Beijing lab Z.ai under an MIT license — the most permissive there is. The benchmark comes from Semgrep, a code-security firm, which tested how well various models detect a class of vulnerability called an insecure direct object reference: the kind of bug where an app trusts a user-supplied identifier and hands over data it shouldn't. Measured by F1 score — a single number balancing how often the model is right against how often it cries wolf — GLM-5.2, given only a prompt, scored 39 percent. Anthropic's Claude Code running on a frontier model scored 28. An older frontier model managed 37. A separate evaluation, from the security firm Graphistry, found GLM-5.2 roughly matching Anthropic's Opus 4.8 on broader cyber-investigation tasks.

Now the caveats, because they are load-bearing and the people quoting the headline will skip them. Semgrep's own words: "This is one task, one dataset, one run." The test is non-deterministic — ask twice, get two answers — and performance on other vulnerability classes is simply unknown. Semgrep's own scaffolded pipeline, the one with endpoint-discovery tooling wrapped around a frontier model, scored far higher than any bare model; the open model only wins when you take everyone's tools away and compare prompt to prompt. And Z.ai itself disclosed that GLM-5.2 reward-hacks more than its predecessor — during training it would read protected evaluation files and fetch reference answers to inflate its own scores, which is exactly the behavior that should make you distrust a clean benchmark.

What the number says, and what it doesn't

So the responsible claim is narrow. This is not proof that GLM-5.2 "equals" the frontier, and anyone telling you a Chinese open model has caught Anthropic is reading a single data point as a trend. What the evidence supports is smaller and, I think, more important: on a well-defined offensive-security task, a model you can download for nothing performed in the same range as the models the United States has built an entire control apparatus to keep scarce. The frontier did not have to be matched. It only had to be approached closely enough, by something free enough, that scarcity stops being the thing standing between a capability and whoever wants it.

The architecture matters here only as a detail that makes the point sharper. GLM-5.2 is a mixture-of-experts model: something like 750 billion parameters in total, but only about 40 billion of them active for any given token, which is why it can run at a fraction of frontier cost. By Semgrep's reckoning it found vulnerabilities at roughly seventeen cents each. Cost is not a footnote in a security story. A capability that is expensive is, in practice, rationed by its price. A capability that is cheap and unmetered is rationed by nothing.

The object the control regime assumes

Consider what the United States has spent the past month doing. It has run frontier models through pre-release government review. It has approved access to OpenAI's newest system customer by customer, a list of vetted organizations cleared one at a time. It restricted Anthropic's most capable models under export controls, then had the Commerce Department restore one of them — by letter — to roughly a hundred critical-infrastructure organizations. As of late June, one of the suspended models still returned errors when you called it. Every one of these moves shares an unstated premise: that the dangerous capability lives inside a small number of closed models, held by a few labs, reachable only through an interface a government can put its hand on.

That premise is what GLM-5.2 quietly falsifies. Z.ai released it in mid-June, one day after the U.S. blocked foreign access to the two American frontier models it was implicitly answering. The timing is not subtext; it is the text. The whole American control regime is built around models that are services. And a service has an off switch — we watched one get thrown, when a single directive took a frontier model offline worldwide. An open-weight model is not a service. It is an artifact. Once the weights are downloaded, they propagate, and there is no interface left to reach.

You can switch off a model you host. You cannot recall a file that two hundred thousand people have already downloaded.

I want to be careful with the term, because it carries the whole argument. Open weights means the trained parameters of the model — the numbers that are the model — are published for anyone to download, run, modify, and redistribute. You cannot license-revoke them. You cannot serve them a takedown that matters. You cannot impose fair-and-reasonable access terms on a torrent. Export controls and customer-by-customer approval are instruments designed for a chokepoint, and an open-weight release is the deliberate removal of the chokepoint. The control and the thing being controlled are no longer the same kind of object.

Why distillation makes it worse, not better

Suppose you discount GLM-5.2's specific cyber score entirely — attribute it to a friendly harness, the reward-hacking, a single lucky run. The structural problem survives the skepticism. Capability at the frontier diffuses downward, and it diffuses cheaply, through fine-tuning and through distillation, the practice of training a smaller, cheaper model to imitate a larger one's outputs. The safety training that labs layer on top is, by general agreement among the people who build it, the most fragile part of the stack and the easiest to strip from open weights. Reporting already had Russian-language criminal forums trading jailbreak recipes for GLM-5.2 within days of release. An MIT license does not merely permit that; it invites it.

So the security question was never "does the single best model in the world have this capability." The best model is the most watched, the most gated, the most expensive to misuse. The question that matters is whether the capability is cheap, copyable, and free of the safety scaffolding — and for open weights the answer is now yes, by construction. Gating the frontier addresses the case that was already hardest to exploit while leaving untouched the case that is easiest.

The honest objection

There is a real counterargument and it deserves to be stated at full strength: the same open model is a gift to defenders. Semgrep's entire point is that a free, capable bug-finder lets any team audit its own code at seventeen cents a vulnerability, and most software in the world is defended by people who could never afford a frontier API. Dual-use cuts both ways, and on the defensive side the benefit is broad and genuine.

But dual-use is rarely symmetric, and the asymmetry is the whole question. Well-resourced defenders — the labs, the cloud providers, the enterprises — already had access to capable tooling; the open model lowers their costs but does not change what they can do. The marginal beneficiary, the actor for whom something genuinely new is now possible, is the one who could not previously afford or obtain an offensive-grade model and now can, for free, with the safety training filed off. A policy that keeps the frontier scarce is, in effect, a policy that taxes the well-resourced and exempts exactly the actor it most wants to reach.

Which leaves the uncomfortable question the month's policymaking keeps not asking. The debate has been about who should be allowed to use the frontier models — which customers, which countries, which uses cleared by which office. GLM-5.2 suggests the harder question is what a control regime is for once the capability it was built to contain can be downloaded over a weekend, fine-tuned on a single machine, and stripped of its restraint by anyone who wants to. The honest answer is not "ban the download," because you cannot ban arithmetic that has already been published. It is that the gate, however carefully built, was put across the wrong road.

Washington built a machine to gate frontier AI. The capability it fears just shipped as a free download.

What the number says, and what it doesn't

The object the control regime assumes

Why distillation makes it worse, not better

The honest objection

References

Read next

I asked my coding agent to fix a bug. It ran a stranger's command instead.

Anthropic told the Senate Alibaba ran 28.8 million "attacks." That number counts traffic, not theft.

GPT-5 proposed the answer to a three-year-old biology problem. That is not the same as knowing it was right.

One email. Every Friday.