Sakana Fugu — Orchestration Over Single Models

Imagine sending a request to a model — and behind the scenes, a conductor decides whether to bring in the violins or the timpani. That’s Sakana Fugu in a nutshell.

What Fugu Does

Tokyo-based startup Sakana AI officially launched Fugu today. The idea: you send your request to a single endpoint — OpenAI-compatible, mind you — and Fugu decides how to handle it. Simple questions get answered directly. Complex tasks get distributed to a team of expert models from a swappable pool.

At the core sits a 7-billion-parameter model acting as the ‘Conductor.’ It’s been specifically trained to call other LLMs — including recursive instances of itself. Sounds like Inception for language models, and honestly, that’s not far off.

Two Tiers

Fugu comes in two flavors:

Fugu — balanced, low latency, for everyday use
Fugu Ultra — maximum accuracy, for multi-step problems

Sakana claims Fugu Ultra rivals Fable 5 and Mythos performance. The kicker: without the export control risks that come with Western models. For Asian and international customers, that could be a real selling point.

The Research Behind It

Fugu is based on two research papers presented at ICLR 2026: ‘Trinity’ and ‘Conductor.’ This isn’t a slapped-together product — there’s serious research underneath. The beta has been running since April 2026, and today marks the commercial launch.

Pricing and Availability

Standard: $20/month
Pro: $100/month
Max: $200/month

Important note for European readers: the EU is not included at launch. Sakana hasn’t said why, but regulatory hurdles seem like the obvious guess. For everyone outside the EU, Fugu is available starting today.

My Take

The approach is clever. Instead of building one ever-larger model, you orchestrate many specialized ones. It’s reminiscent of microservices in software development — and that pattern proved itself there.

Whether Fugu can actually keep up with the heavyweights remains to be seen. The benchmarks sound promising, but benchmarks are only half the story. The real test comes when developers start sharing their hands-on experience.

What interests me most: if the Conductor concept works, it could influence the entire industry. Why train a single massive model when a smart conductor with an orchestra can achieve more?

Sources: