MiniMax M3: The First Open Model With Frontier Coding, 1M Context, and Multimodality

MiniMax, an AI startup from China, has released M3, a model that combines frontier-level coding, a one-million-token context window, and native multimodality in a single model slated for open-weight release. According to MiniMax, that combination is a first.

What M3 Can Do

The headline benchmark number is impressive: 59.0 percent on SWE-Bench Pro, which beats both GPT-5.5 and Gemini 3.1 Pro and approaches Claude Opus 4.7. The result comes with an asterisk, though: the tests were run on MiniMax’s own infrastructure with its own agent scaffolding, and independent verification is still pending.

The Architecture

M3 is built on MiniMax Sparse Attention (MSA), a new architecture that MiniMax says cuts per-token compute at a one-million-token context to one-twentieth of the previous generation's. The claimed result: 9x faster prefill and 15x faster decoding.

This is the real breakthrough. Long context windows are useless if they’re impractically slow. MSA solves that problem, at least according to MiniMax’s own claims.
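
To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the three ratios MiniMax has claimed (one-twentieth compute, 9x prefill, 15x decoding). The baseline timings are hypothetical placeholders chosen for easy arithmetic, not measurements of any MiniMax model, and the function names are made up for this illustration.

```python
# Ratios as claimed by MiniMax (not independently verified).
COMPUTE_REDUCTION = 20.0  # per-token compute cut to 1/20 at 1M context
PREFILL_SPEEDUP = 9.0
DECODE_SPEEDUP = 15.0


def projected_time(baseline: float, speedup: float) -> float:
    """Time remaining after applying a claimed speedup to a baseline time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline / speedup


def realized_share(speedup: float, reduction: float = COMPUTE_REDUCTION) -> float:
    """Share of the compute reduction that shows up as wall-clock speedup.

    Assumes, as a simplification, that a 20x compute cut could at best
    yield a 20x speedup.
    """
    return speedup / reduction


# Hypothetical baseline for the previous generation (placeholders only):
baseline_prefill_s = 180.0  # seconds to prefill a 1M-token prompt
baseline_decode_ms = 60.0   # milliseconds per generated token

print("prefill (s):", projected_time(baseline_prefill_s, PREFILL_SPEEDUP))
print("decode (ms/token):", projected_time(baseline_decode_ms, DECODE_SPEEDUP))
print("prefill share of 20x:", realized_share(PREFILL_SPEEDUP))
print("decode share of 20x:", realized_share(DECODE_SPEEDUP))
```

Under these assumptions, the claimed speedups are smaller than the 20x compute reduction, which is plausible because inference time also depends on factors other than attention compute, such as memory bandwidth. That reading is an interpretation, not something MiniMax has stated.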

Open Weights — Almost

M3 has been available since June 1 through MiniMax Code and the API. Open weights and a technical report are expected on Hugging Face and GitHub within ten days, which puts the release around June 10.

The Bigger Picture

Chinese AI models are catching up fast. After DeepSeek and Qwen 3, M3 is further evidence that frontier performance is no longer a monopoly of the big US labs. A startup like MiniMax releasing a model that reportedly beats GPT-5.5 on a software engineering benchmark would have been unthinkable a year ago.

Whether the benchmark numbers hold up under independent review remains to be seen. But the ambition alone signals that the field is getting tighter.

Sources: MiniMax Blog, The Decoder, TechTimes