Bigger isn’t always better. That’s the core message behind OpenAI’s latest release: GPT-5.4 Mini and GPT-5.4 Nano dropped on March 17, and they’re not about more parameters — they’re about more speed and lower costs.
What’s inside?
GPT-5.4 Mini runs more than twice as fast as its predecessor GPT-5 Mini — and gets surprisingly close to the full flagship model. On SWE-Bench Pro, Mini scores 54.4%, compared to 57.7% for the full GPT-5.4. On OSWorld-Verified, which tests how well a model can operate a desktop computer, Mini hits 72.1% — just shy of the flagship’s 75%.
Nano is another step down in size and price: $0.20 per million input tokens and $1.25 per million output tokens. For comparison, Mini costs $0.75 and $4.50. For batch processing and high-volume workloads, these numbers start to change what’s possible.
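To see what those per-token prices mean in practice, here is a back-of-envelope sketch using the Mini and Nano prices quoted above. The request sizes (1,000 input / 200 output tokens) and the 100,000-request volume are illustrative assumptions, not figures from the announcement:

```python
# Cost comparison using the per-million-token prices quoted above.
# Request shape and batch size are illustrative assumptions.

PRICES = {  # USD per million tokens: (input, output)
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def batch_cost(model: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    """Total USD cost for a batch of identically sized requests."""
    price_in, price_out = PRICES[model]
    per_request = (in_tokens * price_in + out_tokens * price_out) / 1_000_000
    return requests * per_request

for model in PRICES:
    cost = batch_cost(model, requests=100_000, in_tokens=1_000, out_tokens=200)
    print(f"{model}: ${cost:,.2f}")
# gpt-5.4-mini: $165.00
# gpt-5.4-nano: $45.00
```

At this scale the gap is already $120 per 100,000 requests, and it grows linearly with volume, which is why the article singles out batch processing and high-volume workloads.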
Built for the subagent world
The most exciting part isn’t raw performance — it’s the use case. OpenAI is explicitly positioning Mini and Nano as subagents. The idea: a large model handles planning and delegates specialized subtasks to Mini or Nano — fast, cheap, and in parallel.
This isn’t theoretical anymore. In Codex, OpenAI’s coding agent, this exact pattern is already running. And Simon Willison calculated on his blog that you can describe 76,000 photos for $52 using Nano. Those are the kind of numbers that unlock entirely new applications.
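The planner/subagent pattern can be sketched in a few lines. Everything here is a stand-in: `call_model` is a placeholder for a real API call, the fixed three-step decomposition fakes what a flagship planner would produce, and the model name simply mirrors the article's tiering.

```python
# Minimal sketch of the planner/subagent pattern: a large model plans,
# cheap small models execute the subtasks in parallel.
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, prompt: str) -> str:
    # Placeholder: a real system would hit the provider's API here.
    return f"[{model}] {prompt}"

def plan(task: str) -> list[str]:
    # The flagship model would decompose the task into independent
    # subtasks; this stub returns a fixed three-step split.
    return [f"{task}: step {i}" for i in range(1, 4)]

def run(task: str) -> list[str]:
    subtasks = plan(task)  # planning happens on the large model
    # Fan the subtasks out to the cheap model, fast and in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(lambda t: call_model("gpt-5.4-nano", t), subtasks))

print(run("describe photo archive"))
```

The design point is that the expensive model is called once per task while the cheap model absorbs the fan-out, so cost scales with the small model's prices.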
Who gets what?
GPT-5.4 Mini is available in ChatGPT — including for free users via the ‘Thinking’ option. When paid subscribers hit their GPT-5.4 rate limit, they automatically fall back to Mini. Nano is API-only.
My take
The trend is clear: large models are becoming conductors, small models are becoming specialized musicians. If you still think only the biggest model matters, you’re missing the real revolution. The future belongs to systems where multiple models collaborate — and Mini and Nano are built exactly for that.
For developers building agent systems, this is great news. For Anthropic and Google, it’s a signal that the competition over small models is heating up fast.