2 min read AI-generated

Anthropic Apologizes: Fable 5 Was Secretly Throttling — Now You'll See It

After massive backlash, Anthropic reverses course: invisible throttling of Claude Fable 5 on AI research queries will be replaced with visible safeguards.

Remember the uproar from the day before yesterday? Anthropic had buried a rule in Fable 5’s 319-page system card that quietly degraded Claude’s output whenever it detected you were working on competing frontier AI models. No error message, no warning. Just worse answers.

The AI community’s reaction was fierce. Researchers called it ‘sabotage,’ security experts were alarmed, and ‘breach of trust’ came up more than once.

The Reversal

Anthropic has now responded — and clearly. In a statement to Wired, the company wrote:

‘We’re changing Fable 5’s safeguards for frontier LLM development to make them visible. We made the wrong tradeoff and we apologize for not getting the balance right.’

In practice: flagged requests will now visibly fall back to Claude Opus 4.8 — the same approach already used for cybersecurity and biology queries. You’ll see it every time it happens. And on the API, any flagged request will return an explicit reason for the refusal.
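
For developers, the practical consequence is that client code can check for a fallback instead of guessing. The sketch below is only an illustration of that idea: the article does not describe the actual response format, so the field names `model` and `safeguard_reason`, the model identifier strings, and the reason value are all hypothetical placeholders, and the response is modeled as a plain dictionary rather than any real SDK object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SafeguardOutcome:
    """Client-side summary of what happened to one request."""
    fell_back: bool         # True if a different model answered than the one requested
    served_by: str          # model name reported in the response
    reason: Optional[str]   # explicit reason, if the response carried one


def inspect_response(requested_model: str, response: dict) -> SafeguardOutcome:
    # "model" and "safeguard_reason" are hypothetical field names chosen for
    # illustration; the real response format is not specified in the article.
    served_by = response.get("model", requested_model)
    reason = response.get("safeguard_reason")
    return SafeguardOutcome(
        fell_back=served_by != requested_model,
        served_by=served_by,
        reason=reason,
    )


# Made-up payload standing in for a flagged request (all values are assumptions).
flagged = {"model": "claude-opus-4.8", "safeguard_reason": "frontier_llm_development"}
outcome = inspect_response("claude-fable-5", flagged)
if outcome.fell_back:
    print(f"Fell back to {outcome.served_by}: {outcome.reason}")
```

Whatever the real fields turn out to be, the design point is the same: comparing the requested model with the one that actually answered makes a downgrade something you can log and alert on, which was impossible while the throttling was invisible.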

Why It Happened

Anthropic’s explanation makes sense, even if it’s not entirely satisfying: visible safeguards can be probed and circumvented, so they need to be robust, which takes time. Invisible safeguards can be targeted more narrowly, allowing a faster launch with fewer false positives. That’s why Anthropic went with invisible safeguards first — and now admits it was the wrong call.

What Changes — and What Doesn’t

The good news: transparency. You now know when and why Fable 5 refuses a request.

The not-so-good news: the restriction itself stays in place. Distillation attempts and certain frontier AI development queries will still be blocked or downgraded. The change is about visibility, not the underlying rule.

Simon Willison put it well: it’s good that the invisible aspect is gone. It would be even better if Anthropic dropped this category of refusals entirely.

For Anthropic, this is a balancing act. The company wants to protect its most powerful model from misuse — without losing developer trust. The apology was the right first step. Whether the second step follows remains to be seen.

Sources: