A quiet announcement with big consequences: Anthropic has raised the max_tokens limit on the Message Batches API to 300,000 tokens for Claude Opus 4.6 and Sonnet 4.6. Set one beta header, and you get outputs that previously required workarounds.
What changes
Until now, the output cap on the Batches API was far lower than the context window. If you wanted to generate a 200-page document, you had to slice the job into multiple requests, cache the intermediate parts, and stitch them back together. With the new output-300k-2026-03-24 header, a single batch request gets up to 300k tokens of output – more than enough for long reports, large code generation tasks, or structured datasets.
One caveat: this is Batches API only, not the synchronous Messages API. If you need a live response, the regular output caps still apply. But for async workloads – overnight generation, bulk translations, automated reports – this is a real leap.
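To make the mechanics concrete, here is a minimal sketch of what a batch entry opting into the new limit could look like. The beta header value comes from the announcement; the request shape follows the documented Batches API format (a list of entries with custom_id and params), but field names, the model id string, and the custom_id are illustrative placeholders you should verify against the current API reference.

```python
import json

BETA_HEADER = "output-300k-2026-03-24"  # header value from the announcement

def build_batch_entry(prompt: str, model: str = "claude-opus-4-6") -> dict:
    """Build one Batches API entry that requests a very long output.

    The model id is an assumption for illustration; check the models list.
    """
    return {
        "custom_id": "long-report-001",  # placeholder id you choose
        "params": {
            "model": model,
            "max_tokens": 300_000,  # the new ceiling under the beta header
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Headers for the batch creation request; the beta opt-in rides along in
# the standard anthropic-beta header.
headers = {
    "x-api-key": "YOUR_API_KEY",        # placeholder
    "anthropic-version": "2023-06-01",
    "anthropic-beta": BETA_HEADER,
}

payload = {"requests": [build_batch_entry("Write the full annual report.")]}
print(json.dumps(payload, indent=2))
```

The point is that nothing else about the batch workflow changes: you still poll for completion and fetch results asynchronously, which is exactly why the higher cap fits overnight and bulk jobs.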
The other big update: 1M context becomes standard
In parallel, Anthropic announced that the 1M-token context beta for Claude Sonnet 4.5 and Sonnet 4 will be retired on April 30. If you need the million tokens, migrate to Sonnet 4.6 or Opus 4.6 – both ship with the big context window at standard pricing and without a beta flag. No more ‘premium feature’, just default behavior.
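For most codebases the migration is a two-line diff: swap the model id and drop the beta header. The sketch below assumes the old 1M-context opt-in used the anthropic-beta header (as earlier context betas did) and that the new model id is claude-sonnet-4-6; both are assumptions to verify against the docs before shipping.

```python
def legacy_request(messages: list[dict]) -> tuple[dict, dict]:
    """Pre-retirement shape: older model plus a 1M-context beta flag.

    The exact beta flag value varied by model generation; shown here
    only to illustrate where it lived.
    """
    body = {
        "model": "claude-sonnet-4-5",
        "max_tokens": 4096,
        "messages": messages,
    }
    extra_headers = {"anthropic-beta": "context-1m-..."}  # old opt-in (placeholder)
    return body, extra_headers

def migrated_request(messages: list[dict]) -> tuple[dict, dict]:
    """Post-migration shape: newer model, no beta header needed for 1M context."""
    body = {
        "model": "claude-sonnet-4-6",  # assumed id; check the models list
        "max_tokens": 4096,
        "messages": messages,
    }
    return body, {}  # the large context window is now default behavior

msgs = [{"role": "user", "content": "Summarize this corpus."}]
old_body, old_headers = legacy_request(msgs)
new_body, new_headers = migrated_request(msgs)
```

Because the flag disappears rather than changes, pipelines that centralize header construction need only delete one entry; nothing about the message format itself moves.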
My take
Both changes fit the pattern Anthropic has been running since the start of the year: limits that were once introduced as ‘beta’ become the default experience. That makes planning easier – if I build a pipeline today, I don’t have to worry that a beta flag will get pulled in three months. At the same time, 300k output tokens enables use cases that were previously just too fiddly: entire books in one batch request, full codebases generated from a spec, long research reports in a single shot.
For teams running Claude in production, this week’s API release notes are worth a careful read.
Sources: