Investigation · Security

The same AI that defends critical infrastructure can attack it. One company is deciding who gets the shield.

Anthropic's vulnerability-finding model is now scanning power, water, and healthcare systems across 15 countries. The tool that patches them is, by construction, the tool that could break them.

Sam Brenner·Jun 5·9 min

Image: Victor Grigas / Wikimedia Commons (CC BY-SA 3.0)

Anthropic put a number on it themselves, which is the place to start. Describing the expansion of a program it calls Project Glasswing, the company estimated that for most of the organizations now taking part, a successful major attack on their systems could affect more than 100 million people. That is the company's own assessment of the stakes, on the record, in its own announcement. Hold onto it, because the program that number describes is one in which an artificial intelligence is now reading the software that runs power grids, water systems, hospitals, and communications networks across more than fifteen countries, hunting for the flaws an attacker would use to get in. The model is very good at finding them. That is the entire point of the program, and it is also the problem with it.

The facts, as Anthropic laid them out on June 2 and as reported by TechCrunch, are these. The company is extending Glasswing to roughly 150 new organizations across more than fifteen countries, up from about 50 partners when the effort began in early April. The countries named are US-aligned: Australia, Canada, France, Germany, Italy, Switzerland, the Netherlands, Spain, Belgium, Sweden, India, Japan, New Zealand, and South Korea. The sectors are the ones a society cannot do without: power, water, healthcare, communications, and the hardware underneath them. Named participants include Okta, Samsung, SK Hynix, SK Telecom, NATO, and ENISA, the European Union's cybersecurity agency. The tool doing the work, which Anthropic calls Claude Mythos, is described by the company as able to identify thousands of zero-day vulnerabilities over a span of several weeks. Those are the company's words and the company's numbers. They are remarkable, and they are the reason to look closely rather than just to applaud.

A zero-day is a key, not a lock

Begin with what the tool actually produces, because the whole story turns on it. A zero-day is a vulnerability the defender does not yet know exists, which means there is no patch and no warning, zero days to prepare. Finding one is, by its nature, a dual-use act. The same discovery that lets a defender close a hole lets an attacker walk through it, and nothing about the discovery itself decides which use it serves. The flaw does not know who found it.

This is the fact that the framing around these programs works hardest to soften. There is no version of this tool that finds vulnerabilities for defense only. The capability is symmetric by construction. A system that can identify thousands of unknown flaws in critical infrastructure is, described without the comforting adjectives, the most efficient engine ever built for discovering ways to attack critical infrastructure. Aimed at defense, with the results handed to the people who can patch them, it is a shield. The identical artifact, with the results handed to anyone else, is a set of master keys to the power and water systems of fifteen countries. The code is the same. Only the recipient of the output changes.

There is no version of this tool that finds vulnerabilities for defense only. The capability is symmetric by construction. Only the recipient of the output changes. — Sam Brenner

The gatekeeper is a company

Now ask who decides where the shield goes, because the answer is the part that should give a policymaker pause. Anthropic chose the fifteen countries. Anthropic chose the roughly 150 organizations. The selection is reasonable on its face, in the way these things usually are: you start with allies and with partners who agree to participate, and the list of US-aligned nations reflects exactly that. But reasonable or not, it is a private company making an allocation decision with the weight of national security attached to it. These grids get scanned and defended. Those do not. That is a choice, and right now it is a corporate one.

Follow the knowledge, because that is where the real concentration sits. By running this program, Anthropic comes to possess something no government intelligence service has ever held in one place: a current, machine-generated catalogue of unpatched ways into the critical infrastructure of fifteen countries. The asymmetry at the center of every surveillance and security story is here in its purest form. The company knows where the holes are. The public, and in many cases the operators themselves until they are told, do not. A catalogue like that is not a neutral byproduct of doing good work. It is, in the wrong hands, the single most valuable target in the world, and the program manufactures one as a matter of routine.

The shield and the spear are proliferating

Anthropic says it is racing to establish safeguards within the program. Read that sentence as the admission it is. The capability is finished and deployed; the safeguards are still being built. The company is being more candid than most in saying so, and it does not change the order of operations, which is that the powerful thing exists now and the controls on it are a work in progress.

And Anthropic is not the only one holding the tool. OpenAI has released a comparable system, described for the same kind of testing, under the name GPT-5.5-Cyber. So this capability is not contained in a single company that one might at least try to regulate as a single point. It is proliferating across the frontier labs, and proliferation is a one-way street. Once two companies can build a machine that finds thousands of zero-days, the relevant question stops being whether such a thing exists and becomes how many actors will eventually have one, including actors with no interest in defense and no list of allied countries to protect. The existence proof alone pushes the capability outward. A tool like this does not remain defensive because anyone intends it to. It remains defensive only for as long as control over it holds, and control is precisely the thing no one in this story has yet demonstrated.

A disclosure problem that was already breaking

We have written recently about how the human machinery for handling software flaws, the coordinated disclosure process that is supposed to move a vulnerability from the person who found it to the vendor who can fix it, was already under strain when the finders were individual researchers turning up individual bugs. That system runs on trust, on timelines, and on the goodwill of vendors, and it was failing often enough to make researchers angry. Now picture that same process asked to absorb thousands of zero-days, found at machine speed, across a hundred and fifty organizations in fifteen countries, all at once.

The questions that were hard for one bug become acute at that scale. Who gets told first when the tool finds a flaw that exists in systems in eight countries at once. Who patches first, and who waits. What happens during the window between discovery and fix, the interval in which the vulnerability is known to the finder and not to the defender, except that now the window holds thousands of flaws and the finder is a company managing a queue. With a single human researcher, that window is one disclosure that can be tracked. Here it is a backlog of live, unpatched ways into essential systems, sitting on a company's infrastructure, waiting its turn. Automation did not shrink the disclosure problem. It multiplied it, and it concentrated it.

Which returns us to the catalogue, because the catalogue is the disclosure problem made physical. Every flaw the tool finds and has not yet seen fixed is, until it is patched, a target. The list of those flaws is a map of how to take down the power and water and hospital systems of fifteen countries, assembled in one place by a machine. Protecting that map is now itself a critical-infrastructure problem, one that did not exist until the program created it. The defenders have, in the course of building a shield, also built the most dangerous document in the room and made themselves responsible for guarding it.

What it would take

None of this is an argument against finding vulnerabilities before the people who mean harm do. The defensive value is real and it is not small. A flaw that Claude Mythos finds and that an operator patches is a flaw that does not become an outage, a ransom, or a blackout. If the program works as described, it will prevent attacks that would otherwise have happened, and the people spared will never know. That has to be said plainly, because the case for the program is genuine.

The argument is narrower, and it is about accountability rather than capability. A tool this powerful, this inherently dual-use, and this concentrated in private hands should not be governed only by a single company's internal safeguards and its own choice of which fifteen countries are worth protecting. What the record supports asking for is specific. Independent oversight of who holds these capabilities and on what terms, rather than each lab policing itself. Disclosure rules written for machine-scale discovery, instead of the human-scale process that is about to be overwhelmed by it. A custody standard for the catalogues these tools generate, treated as the critical-infrastructure assets they have become. And a clear answer, settled in advance rather than after the fact, to the question of who is accountable when a map of every unpatched hole in a nation's grid is sitting on a corporate server, and the map gets out.

Anthropic set the figure at more than 100 million people. The company offered it as the scale of harm its program is built to prevent. Read it the other way and it is the same number measured from the opposite side: the scale of harm available to whoever ends up holding the tool, or the catalogue it produces, without the intention to defend. The shield and the spear are the same code. The only thing that decides which one a society is looking at is who is holding it, and at the moment that is a decision a small number of companies are making by themselves, on terms they wrote, with the rest of us downstream of whether they get it right.

References

Privacy

The same AI that defends critical infrastructure can attack it. One company is deciding who gets the shield.

A zero-day is a key, not a lock

The gatekeeper is a company

The shield and the spear are proliferating

A disclosure problem that was already breaking

What it would take

References

Read next

Twenty-nine countries founded an AI organisation in Shanghai. The agreement does not say what it can do.

China cleared Apple Intelligence this week. The approval that counted was of the two Chinese models it now runs on.

Apple didn't just lose its gatekeeper case. It lost the right to argue before it complies.

One email. Every Friday.