Anthropic releases a “safe” version of its Mythos AI technology
In April, Anthropic unveiled a new AI system called Claude Mythos.
But the company said it can’t share its new AI with the public because Mythos could become a powerful tool for hackers looking for ways to break into computer networks. The warning sparked panic among executives and government officials who feared the technology would usher in a new era of cyber threats.
Now Anthropic is releasing a straitjacketed version of Mythos that scientists believe is safe for widespread use and relies on older technology for some of what it does, the company said Tuesday.
Called Claude Fable 5, this new system includes additional guardrails designed to block responses related to cybersecurity, biology, and other vulnerable areas. Because of these barriers, hackers may have difficulty attacking computer networks using Fable. However, businesses and cybersecurity professionals may also struggle to defend their networks with the new system.
Most of the queries from Claude users in areas that could be perceived as too risky, the company said, would be handled by Claude Opus 4.8, which was released last month and was also designed to avoid Mythos security risks.
“That’s at the heart of why we were able to release Fable,” said Anthropic’s head of product management, Dianne Penn.
Anthropic has shared an initial version of Mythos with about 40 organizations that maintain critical computing infrastructure so they can use the system to patch security vulnerabilities before hackers exploit them. This month, Anthropic said it will share Mythos with 150 other organizations.
Cybersecurity experts disagreed on whether Anthropic was right to limit the release of the technology. Some applauded the company for restricting access to the Mythos, while others criticized Anthropic for not sharing it with a wider group of researchers who could try it out, see what it could and couldn’t do, and start using it defensively.
Mythos, which is still available to a limited number of organizations, does not have built-in guards in Fable.
External researchers have shown that the guardrails placed on AI systems are not always reliable, a point that Anthropic researchers acknowledge and say they are working on. “That’s why we’ve invested so much in safety research,” Ms Penn said. “It’s an ongoing development.”
Fable outperforms all other publicly available AI technologies in overall performance, according to benchmarks by Vals AI, which tracks the performance of the latest AI technologies. Its overall performance is 5 percent higher than the Claude Opus 4.8, according to Vals AI tests.
The new system is also twice as expensive as Opus 4.8.
Fable is particularly adept at generating computer code and math, said Rayan Krishnan, CEO of Vals AI. But other systems still outperform Fable in other areas, including health care and tax assessment tasks.
Over the past few months, companies in the United States, China and other parts of the world have built artificial intelligence systems that are particularly good at writing computer code. If an AI system can write code, it can potentially identify vulnerabilities in software applications.
Hackers can then use the technology to attack computer networks. However, technology companies, other businesses, and cybersecurity professionals can use this technology to protect networks.
A week after Anthropic unveiled Mythos, the company’s main rival, OpenAI, said it was sharing similar technology with a limited group of partners. However, the company shared its GPT-5.4-Cyber model with a much larger group – hundreds of organizations. It said it will then release it to thousands of other partners in the coming weeks.
(The New York Times sued OpenAI and Microsoft in 2023, alleging copyright infringement on news content related to AI systems. Both companies denied the claims.)