Profile

The man who taught machines to pay attention keeps moving his own

Noam Shazeer co-wrote the idea underneath ChatGPT, Gemini, and Claude. Google paid $2.7 billion to bring him home in 2024. Twenty-two months later he is leaving again — and the pattern is the most revealing thing about him.

Ren Sato·Jun 22·10 min

The Googleplex, Google's headquarters in Mountain View, California

Image: The Pancake of Heaven! / Wikimedia Commons (CC BY-SA 4.0)

In the summer of 1994, in Hong Kong, a teenager from suburban Philadelphia sat down with the rest of the United States team at the International Mathematical Olympiad and did something almost no one does: he got everything right. A perfect score, six problems, forty-two points. The American team that year is still spoken of in the small world that keeps these records as one of the best ever assembled. Noam Shazeer was eighteen. The thing to notice, even then, is not the score. It is what a perfect score requires — the willingness to sit inside a single hard problem until it gives, and the restlessness to move on the instant it does.

Thirty-two years later, that second quality is the one shaping headlines. On a Thursday morning in June, Shazeer announced he was leaving Google to join OpenAI. He had been there, this second time, for less than two years. Google had paid roughly 2.7 billion dollars to get him back. He is, by a reasonable accounting, one of the handful of people most responsible for the technology that the entire industry is now racing to commercialize, and he has just walked out of the company that builds it into a phone in a billion pockets, to go help the rival that builds it into a chatbot used by most of the planet. To understand why that is not as strange as it sounds, you have to look at the shape of the whole career, and at the one thing he keeps reaching for and keeps not finding in the same place twice.

The paper, and the eight who wrote it

The document that matters most was published in 2017 and titled, with a confidence that turned out to be earned, 'Attention Is All You Need.' It introduced the transformer, the architecture that lets a model weigh which parts of its input matter, attending to some words and discounting others. It is, almost without exception, the skeleton inside every large language model in use today. ChatGPT is a transformer. Gemini is a transformer. Claude is a transformer. Shazeer was one of eight authors, and by most internal accounts more central to the actual engineering than a byline that lists the eight in random order suggests; he had been building the pieces (the attention mechanisms, the enormous mixture-of-experts layers) for years before the paper gave them a name.

Here is the detail that tells the story. Of those eight authors, almost none stayed at Google. They scattered across the industry like seeds off a struck dandelion, and the list of what they left to build reads like a map of the modern field:

Cohere, the enterprise-AI company, was co-founded by one of them.
Character.AI, the consumer chatbot company, was co-founded by another — Shazeer himself.
Several went to startups working on the next generation of models, or founded their own.
A few cycled through OpenAI, the place Shazeer is now headed.

The paper that defined Google's deepest technical advantage became, in effect, Google's most expensive talent-export program.

There is a kind of person who writes the seminal paper and then spends the rest of a comfortable career being the person who wrote it. Shazeer is the other kind. The paper was not a destination for him; it was a door, and he has spent the years since walking through one door after another, never quite settling in the room on the far side.

The thing he wanted to ship

To see what he is chasing, go back to the first departure. Around 2020, Shazeer and a colleague named Daniel De Freitas had built, inside Google, a conversational model called Meena — a chatbot good enough that, by later accounts, some of the people who worked on it believed it was something the public should be allowed to talk to. Google, cautious about reputational risk in a way that looks either prudent or tragic depending on what happened next, declined to release it. The model stayed in the building. The two men did not.

They left in 2021 and founded Character.AI, where the entire premise was the thing Google would not do: let people talk to the machine, as much as they wanted, about anything. It worked, commercially and culturally, faster than almost anyone expected. And the lesson Shazeer drew from it is, I think, the key to all of it. He had been right. The thing was ready. The only thing standing between the work and the world had been an institution's reluctance to ship. For a man whose defining trait is the need to push a finished idea out the door, that is not a business disagreement. It is closer to a wound.

The most expensive homecoming in software

And then, in August 2024, he went back. Google paid about 2.7 billion dollars in a deal structured as a license for Character.AI's technology, an arrangement that conveniently returned Shazeer and a slice of his team to the mothership. By reported estimates his own share ran somewhere between 750 million and a billion dollars. He took the title of vice president of engineering and became a co-lead of Gemini, alongside Jeff Dean — Google's most storied engineer — and Oriol Vinyals. On paper it was a coronation: the prodigal architect, brought home at a king's ransom, handed the keys to the model that carries the company's future.

It is worth sitting with how unusual that is. Companies do not generally pay nine, ten figures to re-acquire one person's attention. They did it because the thing they were buying was not really a chatbot startup; it was the specific shape of one man's thinking about how these models should be built, and the fear of what a competitor would do with it. For a year and a half it looked like the bet was paying off. Gemini got materially better. The man who had left over an unshipped model was now near the center of the most aggressively shipped model in the company's history.

Which is what makes the second leaving so much louder than the first. You can walk away from a company that won't let you ship. What does it mean to walk away from a company that paid 2.7 billion dollars to put you in charge of shipping?

What Altman said, and what he didn't have to

The reception on the other side was warm in a way that itself says something. Sam Altman, OpenAI's chief executive, publicly welcomed the hire and noted that Shazeer was one of the people he had most wanted to work with since the company's earliest days. That is not boilerplate. OpenAI is, by mid-2026, locked in a capability race with Anthropic and circling its own enormous public offering, and the currency in that fight is no longer just compute or data — it is the small number of people who can still meaningfully change how a frontier model is built. Altman did not have to spend words establishing that Shazeer mattered. Everyone in the room already knew. He spent them signaling that he had wanted him for a decade, which is a way of saying: this one we tried to get before, and now we have.

Notice what is conspicuously absent from the public record around all of this: Shazeer himself, saying much of anything. He is not, by the evidence of a long career, a man who explains himself in essays or stages elaborate exits. The work is the statement. He built Meena and left when it stayed in the drawer. He built Character.AI and left when Google made the number large enough. He co-led Gemini and is leaving now, in a season when OpenAI needs exactly his kind of mind and is willing to say so out loud. The pattern speaks at a volume his own quiet never does.

The restlessness, read generously

It would be easy, and a little lazy, to call this mercenary — the serial departure of a man chasing the next package. The money is real and I am not going to pretend a near-billion-dollar payday is incidental. But the mercenary reading does not fit the earliest fact about him, the one from Hong Kong. The kid who solved all six problems was not chasing a package. He was chasing the problem, and the particular discomfort of a problem already solved.

Here is the more interesting version. Shazeer built the mechanism that taught machines to allocate attention — to decide, out of everything available, what is worth weighing right now. It is almost too neat that his own career is a study in exactly that operation performed on himself, over and over: a continuous re-deciding of where the most important work is happening and a refusal to keep attending to a place once the answer changes. Google in 2017 was the most important room in AI. By 2021 the most important thing to him was shipping, and the room had moved. In 2024 the room moved back, with a check attached. In 2026, with the frontier labs sprinting toward their IPOs and OpenAI publicly hungry for him, the room has moved again.

The people around each of these institutions tend to hear the first version of the story — the loyalty, the homecoming, the mandate. I have come to think Shazeer is always already listening for the second one: the quiet signal that the action has relocated, that the thing he is best at building is now being built more freely somewhere else. He is not disloyal so much as he is incapable of pretending not to notice when the center of gravity shifts. For most people that noticing is a passing thought in the shower. For him it appears to be a resignation letter.

The cost of being the indispensable man

There is a melancholy underneath the triumph, if you look. A man who is wanted everywhere is, in a specific way, at home nowhere. The 2.7 billion dollars did not buy Google his staying; it bought, it turns out, twenty-two months. OpenAI is welcoming him now with the same warmth Google did, and the same implicit hope — that this time the architect will settle, will pour the next decade into one institution's models. The career to date suggests they should enjoy it while it lasts. The thing that makes him worth a billion dollars is the same thing that makes him impossible to keep: he attends, fully and brilliantly, until he doesn't.

He lives in Palo Alto, a few minutes from both campuses, with a wife and three children; his grandparents fled the Holocaust, his father was a mathematician turned engineer, his sister became a rabbi. He was elected to the National Academy of Engineering this year. By every external marker he is a man who has arrived. And he has just packed up his attention and carried it across town again, to a company that will spend the next stretch of its life hoping that the most restless mind in the field has finally found the room he meant to stay in.

The transformer he helped build has a famous property: at every step, it reconsiders the entire context and recomputes what deserves its focus, holding nothing fixed, loyal to no earlier weighting once the evidence shifts. Watch the man instead of the model for a moment, and the architecture stops looking like a piece of math. It starts looking like a self-portrait.
