
Mikael Gorsky

AI researcher and lecturer at HIT – Holon Institute of Technology
There is a risk that attackers could use next-generation models to attack software that manages critical systems in banking, healthcare, logistics, energy and transportation. Photo: Kevin Ku / Unsplash.com


Last Tuesday, the good folks at Anthropic changed the world's balance of power by releasing a butterfly of rare beauty. The butterfly's formal name is Greta oto, but for the transparency of its wings it is popularly known as the glasswing. Someone at Anthropic is clearly suffering from excessive encyclopedic knowledge, and a hundred-million-dollar project of the highest importance has been named after the glasswing, writes Mikael Gorsky, an AI researcher at the Holon Institute of Technology and author of The AI Pravda.

What happened

While preparing the release of their new model, Claude Mythos Preview, researchers at Anthropic discovered that the model is a talented hacker, capable of identifying and exploiting vulnerabilities in existing code with inhuman strength and speed. The model's skills, habits, and flaws are described in a 245-page model card.

The model's power so impressed Anthropic that the company decided not to release it for general use, limiting itself to a press release describing the merits and achievements of Claude Mythos. As is customary in the industry, the press release includes a long list of benchmarks on which the model came close to a 100% score, but statistics of this kind are tricky, conditional and dubious.

The press release also mentions Claude Mythos's accomplishments as a computer-systems security auditor:

- Claude Mythos Preview autonomously discovered several thousand previously unknown vulnerabilities in popular operating systems, web browsers, and other code.

- Claude Mythos Preview found a vulnerability in the OpenBSD operating system that could cause any OpenBSD host responding via TCP to fail, potentially shutting down corporate, government, and Internet servers. It went undetected for 27 years until Claude Mythos Preview discovered it, among other OpenBSD vulnerabilities.

- Claude Mythos Preview discovered a chain of bugs in the Linux kernel that allowed it to gain root access and take control of the system. This vulnerability has been fixed.

In fact, the moment that AI industry leaders have been warning about for years has arrived: the new generation of LLMs is so powerful that the prospect of attackers wielding such a model is genuinely sickening.

An alarmed Anthropic has released a six-minute video about how worried they are about what has happened. Serious people with serious faces - a VP of Microsoft, the head of the Linux Foundation, the heads of CrowdStrike and Palo Alto Networks - tell us how they fear that attackers could use next-generation models to attack software that controls critical systems in banking, healthcare, logistics, energy and transportation, posing serious risks to individuals, the economy and national defense.

In addition to the disturbing video, Anthropic has created a $100 million project that gives members of a consortium of about 50 leading U.S. IT companies exclusive access to Claude Mythos, along with tokens worth that amount. The goal is for these companies to use the model's remarkable abilities to discover and fix vulnerabilities and sources of potential problems before they are found by malicious users of next-generation models.

The community of open-source developers has not been forgotten either, though only $4 million in credits has been allocated to them (the cost of using the model, it must be said, is impressive: $25 per million input tokens and $125 per million output tokens).
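To get a feel for what those prices mean in practice, here is a minimal back-of-the-envelope sketch using the rates quoted above ($25 per million input tokens, $125 per million output tokens). The scan sizes in the example are made-up illustrations, not figures from Anthropic.

```python
# Hypothetical cost calculator for Claude Mythos Preview usage,
# based solely on the per-token prices quoted in the article.

INPUT_PRICE_PER_M = 25.0    # USD per million input (source) tokens
OUTPUT_PRICE_PER_M = 125.0  # USD per million output (result) tokens

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one model run at the quoted rates."""
    return (input_tokens / 1_000_000 * INPUT_PRICE_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M)

# Illustrative scan: ~20M tokens of code read in, ~2M tokens of findings out.
cost = run_cost(20_000_000, 2_000_000)
print(f"${cost:,.2f}")  # 20 * $25 + 2 * $125 = $750.00
```

At that rate, the $4 million allocated to open-source maintainers would cover a few thousand audits of that (assumed) size, which puts the relative scale of the two grants in perspective.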

I haven't seen any public discussion of whether Claude Mythos will be provided to the FBI, NSA, CIA, FinCEN and other serious agencies, although Dario, in the alarmed video, talks about how important it is to protect government resources and services from potential attacks created and directed by powerful LLMs. I imagine that cybersecurity professionals working for U.S. government agencies remember Trump's brave post banning all government offices from using Anthropic products.

However, Trump's ban didn't stop Treasury Secretary Scott Bessent from holding a meeting with major banks with a single item on the agenda: preparedness for new cyber threats.

Conclusion one

It is clear that the ability to allow or prevent cyberterrorists from hacking into the security systems of nuclear and hydroelectric power plants is more power over people's lives than is available to the rulers of the vast majority of nations. In other words, the time has already come when private companies, the creators of LLMs, have more power than the elected leaders of countries. There is nothing pleasant or useful in this situation - neither for the world, nor for the leaders and shareholders of these private companies.

Conclusion two

For both technological (specifics of model training) and social (spread of knowledge and experience) reasons, the best open source models are about 9-12 months behind the best models from the main AI labs. As the complexity and power of advanced models increases, this gap may grow a bit, for training very powerful models requires many, many top GPUs from Nvidia. However, the U.S. administration's unwise policy of relaxing chip export restrictions as much as possible, as well as unlimited GPU shipments to Arab monarchies, could change the situation.

But there is no scenario in which models comparable to Mythos do not appear in open source within a couple of years. And open source means Iran, North Korea and Russia - and China, of course, as well.

Conclusion three

The mighty Claude Mythos will fix most of the vulnerabilities in our software at some point, and after that it will get to work building the next generation of LLMs. And it will succeed, unless of course Trump destroys Anthropic before then. As a result, the U.S. will maintain and extend its dominance in AI, and that dominance, as we can now see, is paving the way for a return to a unipolar world. To which congratulations are in order.

Conclusion four - for Israeli moms and grandmothers

If your kid is hesitating between a cybersecurity course for kids and a theater class, the doubts can end. Let the robots take the cybersecurity classes now. All humans, off to theater!

P.S. It would be oh so strange if the instinctive reaction of people not inclined to think through what they say were anything other than "they're just doing marketing". Uh-uh. The coolest marketing, I must say: "we made shit we're afraid of". You can add up the salaries of the people Anthropic interviewed for its video and calculate how much such a "disinformation campaign" would cost. You can also add up the capitalization of the companies involved in Project Glasswing and imagine it going to zero the moment someone exposes the deception. Powerful Anthropic marketing, at the cost of the GDP of half of Eurasia :-). And that's if you forget for a second that Anthropic's revenue this year is expected to be $30 billion, and that the only limit on that revenue is a shortage of computing power to produce tokens. All in all, the haters are funny as always.

This article was AI-translated and verified by a human editor
