
Mythos Hacked: Unauthorized Access to Anthropic's Most Dangerous Model


A group gained access to Claude Mythos through a third-party contractor. Anthropic is investigating — and the timing couldn't be worse.


This is the last headline Anthropic needs right now: a small group has gained unauthorized access to Claude Mythos — the model Anthropic itself deemed too dangerous for public release.

What happened

Bloomberg reported on April 21 that members of a private Discord channel focused on unreleased AI models had obtained access to Mythos. The group exploited credentials belonging to an employee at a third-party contractor working with Anthropic, then combined them with an educated guess at the model's URL, based on the format Anthropic uses for its model endpoints.

The timing is remarkable: access was gained on the same day Anthropic first announced plans to make Mythos available to select companies for testing.

Why this matters

Mythos isn’t an ordinary language model. Built under Anthropic’s Project Glasswing initiative, it was specifically designed for cybersecurity applications — capable of discovering zero-day vulnerabilities in operating systems and browsers, chaining software bugs into multi-step exploits, and demonstrating capabilities previously reserved for the most skilled human hackers.

That’s precisely why Anthropic kept access tightly restricted. Early users included Goldman Sachs, Apple, and select security firms — all under strict conditions.

Anthropic’s response

Anthropic told TechCrunch it is investigating the reported unauthorized access through a third-party vendor environment. The company says there’s no evidence its own systems have been impacted. The group has apparently been using Mythos regularly and provided Bloomberg with screenshots and a live demonstration — though reportedly without malicious intent.

The bigger picture

For Anthropic, this incident comes at the worst possible time. Peace talks with the White House are underway, Trump just signaled that a Pentagon deal is possible, and Amazon has committed $25 billion in investment. A security incident involving the very model being kept under wraps because of its dangerous capabilities undercuts Anthropic's narrative of responsible AI development.

It’s also not the first incident in recent weeks: the accidental Mythos leak in March, then the Claude Code source code appearing on npm, and now this. Three security incidents in four weeks — that’s a pattern, not a fluke.

The critical question isn't whether the group caused harm. It's how a model that Anthropic itself considers too dangerous for the public could be reached with a contractor's credentials and a guessed URL.

