
Anthropic told the Senate Alibaba ran 28.8 million "attacks." That number counts traffic, not theft.

The headline figure is real and auditable. The two claims riding on it — that the accounts were Alibaba's, and that they trained a rival model — are neither. Here is how to read a forensic accusation written as a policy ask.

Image: The west front of the United States Capitol, where Anthropic sent its letter to the Senate Banking Committee. Credit: Architect of the Capitol (public domain), via Wikimedia Commons.

Every few weeks a number arrives with its verdict already attached. This week's is 28.8 million. That is how many exchanges with Claude, Anthropic told the US Senate Banking Committee in a letter dated June 10, came from a coordinated campaign it attributes to Alibaba — "the largest known distillation attack" the company says it has ever documented, run through roughly 25,000 fraudulent accounts and commercial proxy services over about six weeks this spring. The figure is enormous, it is precise, and it has been reposted across the week with the word "theft" welded to the front of it. So before it hardens into something everyone knows, the question this column always asks before reposting: 28.8 million what, measured how, and proving exactly what?

Start with what's solid, because some of it is. Counting API requests is the one thing a model provider can genuinely do well. Anthropic sees every call that hits Claude; if it says 28.8 million exchanges traced to a cluster of accounts it had flagged, that is the kind of claim its own logs can support, and there is no obvious reason to doubt the raw tally. The shape of the cluster is ugly in a recognisable way, too — roughly 25,000 accounts, opened to look unrelated, routed through commercial proxies to defeat the geographic limits on who is allowed to use the model at all. That pattern is consistent with someone deliberately scraping a service against its terms, at scale, while trying not to be seen doing it. Something happened here. Credit the count.

Three claims wearing one number

The trouble is that "distillation attack" is not one claim. It is three, stacked on top of each other, and they are not equally knowable. Claim one is volume: 28.8 million exchanges occurred. That is auditable and probably true. Claim two is attribution: those accounts belonged to Alibaba — specifically, per the reporting, operators linked to its Qwen model lab. That is a forensic claim about who was on the other end of a proxy. Claim three is purpose and effect: the harvested outputs were used to train a competing model, and the capability actually transferred. That is an inference about something happening inside another company's training pipeline, which Anthropic cannot see and has not seen.

The headline borrows the certainty of claim one and lends it to claims two and three. But only the first is the kind of thing that lives in a log file. Attribution behind 25,000 deliberately anonymised accounts is an investigation, with a method and a false-positive rate, neither of which has been published. And the leap from "they collected our outputs" to "they built a rival on them" is the part no provider can establish from its own side of the wire. You can watch every drop of water leave your tap. You cannot, from that, prove what someone built in a house you have never entered.

What distillation is, and why it resists proof

Distillation as a technique is old, legitimate, and genuinely effective — which is exactly why the accusation is plausible and exactly why it is hard to nail down. The idea goes back to a 2015 paper by Geoffrey Hinton and colleagues: take a strong, expensive "teacher" model, feed it a pile of queries, collect its answers, and train a smaller, cheaper "student" to imitate them. The student inherits a surprising amount of the teacher's behaviour at a fraction of the cost. Done to your own model, it is standard practice. Done to a competitor's API without permission, it is what Anthropic is calling an adversarial distillation attack. Nobody disputes that this works. The dispute is whether you can ever prove it was done to you.
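For readers who want the mechanics, here is a minimal sketch of the training signal described in the 2015 paper: the student is scored on how closely its softened probabilities match the teacher's. The scores, the three-answer setup and the temperature value are illustrative assumptions, not anything taken from Anthropic's letter. A commercial API typically returns text rather than full probability distributions, so imitation through an API would look more like ordinary supervised training on the collected answers.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened distribution."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]  # already imitates the teacher closely
poor_student = [0.5, 4.0, 1.0]  # prefers a different answer

loss_good = distillation_loss(teacher, good_student)
loss_poor = distillation_loss(teacher, poor_student)
print(f"close imitator: {loss_good:.3f}, poor imitator: {loss_poor:.3f}")
```

Training pushes the student toward the lower loss, which is why a large enough pile of teacher answers is valuable to whoever holds it.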

Here is the seam. You cannot reliably fingerprint a finished model's training data just by looking at the model. Two systems can converge on similar answers because they were trained on overlapping slices of the same internet, not because one copied the other. Qwen is trained on an enormous corpus of its own; any modern frontier model and any capable open-weights model will look somewhat alike on the tasks everyone optimises for, because they are all climbing the same hills. To move from suspicion to proof you would need one of two things: the other lab's training records, which you will never get, or a statistical signature in the student that could only have come from your teacher, such as a planted canary string, an idiosyncratic refusal style, or a watermark that survives the copying. Anthropic's letter, as it has been described publicly, asserts the pattern of access. It has not, so far, published the signature that would close the loop.
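The simplest form of such a signature check can be sketched in a few lines: plant strings that should not occur by chance, then look for them in a suspected student's outputs and in a control model that never touched the teacher. Everything below is hypothetical, including the canary strings and the sample outputs. A real watermark test would be statistical rather than an exact-match search.

```python
def canaries_found(outputs, canaries):
    """Return the planted strings that appear verbatim in any collected output."""
    return {c for c in canaries if any(c in text for text in outputs)}

# Hypothetical canaries: strings unlikely to occur by chance on the open web.
CANARIES = {"quillfeather-7f3a-marrowglass", "tessellate-91bd-owlhinge"}

# Hypothetical outputs sampled from a suspected student and from a control
# model that never had access to the teacher.
suspect_outputs = [
    "Sure, the reference code is quillfeather-7f3a-marrowglass.",
    "Here is a sorting function in Python.",
]
control_outputs = [
    "I do not recognise that reference code.",
    "Here is a sorting function in Python.",
]

suspect_hits = canaries_found(suspect_outputs, CANARIES)
control_hits = canaries_found(control_outputs, CANARIES)
print(f"suspect: {len(suspect_hits)} canaries, control: {len(control_hits)} canaries")
```

The control matters as much as the suspect: a canary that also surfaces in a model with no access to the teacher proves nothing.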

Counting requests proves a scrape happened. It does not prove a model was built from them. Those are different sentences, and only one of them is in the logs.

Compared to what?

The bigness of the number is doing rhetorical work, so it is worth asking the boring question: 28.8 million compared to what? The letter, as described, supplies no denominator, but if Claude fields billions of API calls over a six-week window, as a service of its size plausibly does, the figure is a small fraction of total traffic. That does not make it harmless, because concentrated, fraudulent, terms-violating access is a real problem at any size. But the alarm should come from the concentration and the intent, not from the raw magnitude, which is being presented as if it were self-evidently a siege. A number can be both large in isolation and minor as a share of the system, and which framing you reach for tells the reader what you want them to feel.
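To see how much the framing depends on the denominator, here is a small sketch that sets the 28.8 million against a few assumed totals for the same six-week window. Only the flagged figure comes from the letter; the totals are hypothetical placeholders, since the real one has not been published.

```python
FLAGGED_EXCHANGES = 28_800_000  # figure from the letter

def share_of_traffic(flagged, total):
    """Fraction of all traffic in the window that the flagged exchanges represent."""
    return flagged / total

# Hypothetical six-week totals; the real denominator is not public.
assumed_totals = [1_000_000_000, 5_000_000_000, 10_000_000_000]

shares = {total: share_of_traffic(FLAGGED_EXCHANGES, total) for total in assumed_totals}
for total, share in shares.items():
    print(f"{total:>14,} total calls -> {share:.3%} flagged")
```

Under these assumptions the same campaign is anywhere from about 0.3 percent to about 2.9 percent of traffic, which is why the denominator belongs in the story.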

Then there is the comparison the letter explicitly invites: that this campaign "dwarfs" the roughly 16 million exchanges across some 24,000 accounts that Anthropic attributed in February to three Chinese labs — DeepSeek, MiniMax and Moonshot. Set the two side by side and you have a tidy escalation story. But both figures were produced by the same accuser, using the same undisclosed attribution method, scoring the same kind of evidence. Bigger-than-last-time is only meaningful if you trust the ruler, and the ruler has not been shown. An internal estimate that grows is not the same as an external fact that was independently confirmed and then grew.
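The arithmetic behind "dwarfs" is easy to check from the two sets of figures Anthropic has given. The sketch below uses only the rounded numbers quoted above and treats "about six weeks" as 42 days, which is an assumption. The duration of the February campaigns is not stated here, so no daily rate is computed for them.

```python
# Rounded figures as quoted from Anthropic's two disclosures.
spring = {"exchanges": 28_800_000, "accounts": 25_000}
february = {"exchanges": 16_000_000, "accounts": 24_000}

ASSUMED_SPRING_DAYS = 42  # "about six weeks", taken as exactly 42 days

volume_ratio = spring["exchanges"] / february["exchanges"]
spring_per_account = spring["exchanges"] / spring["accounts"]
february_per_account = february["exchanges"] / february["accounts"]
spring_per_account_per_day = spring_per_account / ASSUMED_SPRING_DAYS

print(f"volume ratio: {volume_ratio:.1f}x")
print(f"per account: {spring_per_account:.0f} vs {february_per_account:.0f}")
print(f"spring, per account per day: {spring_per_account_per_day:.1f}")
```

On these numbers the new campaign is 1.8 times the earlier total, at roughly 1,150 exchanges per account against roughly 670, or about 27 per account per day. Whether 1.8 times counts as dwarfing is a judgment the reader can now make.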

The scariest claim has no error bars at all

The most alarming line in the letter is the most mechanistic one: a model trained only on Claude's outputs, Anthropic warns, inherits the capability profile without the alignment pipeline — it can match Claude's competence at software engineering or agentic reasoning while shedding the safety conditioning that was supposed to come with it. As a mechanism, that is entirely believable. Imitating what a model says does not require importing the guardrails that shape what it refuses to say. But believable is not demonstrated, and there is no measurement here at all — no benchmark of a resulting model, no test showing that a specific student is both capable and unaligned, only the assertion that such a thing could be produced. The genuinely worrying part of the argument is the part carrying zero evidence, and it is being asked to support the heaviest policy.

Read the document for what it is

All of this matters more because of where the number lives. It is not in a reproducible forensic report or a methodology section other researchers can check. It is in a letter to two senators — Banking Committee chair Tim Scott and ranking member Elizabeth Warren — that asks for specific things the sender stands to benefit from: tighter export controls on advanced AI chips, mandatory screening of high-volume API usage, and antitrust clarification so that competing labs can legally share information about suspected distillation. Those may all be defensible policies. The point is narrower and unavoidable: an advocacy document is built to present its strongest case, and the evidentiary bar for "told the Senate" is lower than the bar for "showed the work." When the White House's own April memo had already flagged distillation as a national-security concern, the letter is pushing on a door that was already open — and the incentive in that situation is to supply the biggest clean number, not the most heavily caveated one.

What would turn suspicion into proof

None of this is a defence of Alibaba, which has declined to comment, and which may well have done exactly what it is accused of. It is a list of what is still missing before the headline earns the word "theft." Four things would move this from a measurement of suspicion to a finding of fact:

  1. The attribution method: how 25,000 deliberately unlinked accounts were tied to a single operator (shared payment rails, shared infrastructure, behavioural fingerprints) and, crucially, the false-positive rate of that method.
  2. A capability signature: a canary or held-out probe planted in Claude that surfaces in the suspected student model and could not have arrived any other way.
  3. The denominator: 28.8 million set against total traffic and against the volume a single large legitimate customer generates, so readers can judge concentration rather than just absorb a big number.
  4. An independent look: third-party forensic access to the logs, the way a serious breach gets an outside investigator, instead of a tally summarised inside a letter to lawmakers.
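Why the false-positive rate in the first item matters can be shown with a rough sketch. If a screening method is run over a large pool of innocent accounts, even a small error rate produces a meaningful number of wrongly flagged ones. Only the 25,000 figure comes from the letter; the pool sizes and rates below are hypothetical, because neither has been published.

```python
FLAGGED_ACCOUNTS = 25_000  # figure from the letter

def expected_false_flags(innocent_accounts_screened, false_positive_rate):
    """Innocent accounts a screening method would be expected to flag by mistake."""
    return innocent_accounts_screened * false_positive_rate

def share_wrongly_flagged(flagged, innocent_accounts_screened, false_positive_rate):
    """Share of the flagged cluster that could be innocent, capped at 100 percent."""
    mistaken = expected_false_flags(innocent_accounts_screened, false_positive_rate)
    return min(1.0, mistaken / flagged)

# Hypothetical inputs: (innocent accounts screened, false-positive rate).
scenarios = [
    (1_000_000, 0.0001),
    (1_000_000, 0.005),
    (5_000_000, 0.005),
]

results = [share_wrongly_flagged(FLAGGED_ACCOUNTS, n, r) for n, r in scenarios]
for (n, r), share in zip(scenarios, results):
    print(f"{n:,} screened at {r:.2%} error rate -> up to {share:.1%} of flagged accounts innocent")
```

Under these made-up inputs the innocent share runs from 0.4 percent to the entire cluster, which is the spread a published method and error rate would narrow.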

The honest version of this story is narrower than the headline and still serious. Someone ran a large, fraudulent, terms-violating scrape of Claude; Anthropic has logs it finds alarming and a competitor it is now willing to name in public; and the policy machinery it wants was already in motion. That is a real event worth covering. It is not yet the same thing as a proven theft of a model, and the distance between those two descriptions is the entire question in front of the Senate. Until the method is on the record, 28.8 million is a precise measurement of how much traffic looked suspicious — not of how much was stolen. When a number turns up already knowing what it means, that is precisely the moment to look harder.

References

  1. CNBC — Anthropic accuses Alibaba of campaign to 'brazenly' and 'illicitly' extract AI capabilities
  2. The Next Web — Anthropic accuses Alibaba of running largest distillation campaign against Claude
  3. TechTimes — Alibaba ran largest known AI theft campaign against Claude, Anthropic tells Senate
  4. OODAloop — Anthropic says Alibaba illicitly extracted Claude AI model capabilities
  5. Hinton, Vinyals & Dean (2015) — Distilling the Knowledge in a Neural Network