2 min read AI-generated

From PyTorch Model to Browser App: How Simon Willison Ported Moebius With Claude Code


A 0.2B inpainting model that normally needs CUDA — running entirely in the browser. Simon Willison had Claude Code convert the model to ONNX, push it to Hugging Face, and build the web app around it. A neat showcase of how far Opus 4.8 reaches.


Sometimes a small weekend experiment shows where we are with AI tools better than any product launch. Simon Willison spotted Moebius on Hacker News, a compact inpainting model where you mark regions of an image and the model imagines what should fill the gap. The catch: it normally needs PyTorch and NVIDIA CUDA.

0.2B sounds like “this could run in a browser”

Willison paused at the “0.2B” in the name: 0.2 billion parameters, small enough to try running directly in a browser via WebGPU. Instead of grinding through it himself, he handed the job to Claude Code, which took on the whole chain: convert the PyTorch model to ONNX, publish the result to Hugging Face, then build a web app and interface that loads and runs the model.
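To make the last link of that chain a little more concrete, here is a minimal sketch of how a web app might decide which backend to run an ONNX model on. It assumes the app uses ONNX Runtime Web, which the source does not confirm; `"webgpu"` and `"wasm"` are that library's execution-provider names, while `pickBackends` is a hypothetical helper and the WebAssembly fallback is an illustrative choice, not a description of Willison's demo.

```typescript
// Execution-provider names as used by ONNX Runtime Web (assumed runtime).
type Backend = "webgpu" | "wasm";

// Hypothetical helper: prefer WebGPU when the browser exposes it,
// and keep WebAssembly as a fallback in either case.
function pickBackends(hasWebGpu: boolean): Backend[] {
  return hasWebGpu ? ["webgpu", "wasm"] : ["wasm"];
}
```

In a browser, the flag would typically come from a feature check such as `"gpu" in navigator`, and the resulting list could be passed as the `executionProviders` option of `ort.InferenceSession.create(modelUrl, { executionProviders })`.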

The remarkable part isn’t that any single step works — it’s that Opus 4.8 walks the entire path in one go. Model conversion, hosting, frontend: those are usually three different hats, and here one agent puts them on in sequence.

It runs. In all the major browsers.

The result is a working demo that runs in Chrome, Firefox, and Safari. One finding Willison highlights: the CacheStorage API copes with model files of around 1.3 GB. In plain terms, inpainting can be a feature of a client-only web app. No server, no GPU cloud, no per-request API cost. The model is downloaded into the browser cache once and runs locally after that.
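The load-once pattern can be sketched roughly as follows. This is not the code from Willison's demo: `loadModelBytes`, `CacheLike`, `ResponseLike`, and `FetchLike` are hypothetical names, and the interfaces are deliberately small subsets of the browser's `Cache` and `Response` types so the sketch also runs outside a browser. The idea is simply to look in the cache first and only hit the network on a miss.

```typescript
// Minimal subsets of the browser's Response and Cache types (assumed shapes
// for this sketch; the real browser objects satisfy them structurally).
interface ResponseLike {
  ok: boolean;
  status: number;
  clone(): ResponseLike;
  arrayBuffer(): Promise<ArrayBuffer>;
}

interface CacheLike {
  match(url: string): Promise<ResponseLike | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

type FetchLike = (url: string) => Promise<ResponseLike>;

// Return the model bytes, downloading them only if they are not cached yet.
async function loadModelBytes(
  url: string,
  cache: CacheLike,
  fetchFn: FetchLike,
): Promise<ArrayBuffer> {
  const cached = await cache.match(url);
  if (cached !== undefined) {
    // Cache hit: no network request is made.
    return cached.arrayBuffer();
  }

  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Model download failed with HTTP ${response.status}`);
  }

  // Store a clone, because a response body can only be consumed once.
  await cache.put(url, response.clone());
  return response.arrayBuffer();
}
```

In a browser, the cache would come from `await caches.open("some-cache-name")` (the name is arbitrary) and the fetch function could be `(u) => fetch(u)`. On the second visit, the same call resolves from CacheStorage without touching the network.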

My take

This is exactly why I read Willison’s blog. Someone starts with “that looks small, wonder if it runs in a browser?” and ends up with a client-only app running a real ML model locally. That’s the kind of unplanned discovery these tools produce constantly — you ask one thing and end up somewhere else. What I find most interesting is the implication: if an agent can convert, host, and wrap a model into a web app, the barrier to just trying small models fully offline in the browser drops dramatically. Not every model is a tidy 0.2B — but a surprising number of useful ones are.

Sources: Simon Willison: Porting the Moebius 0.2B image inpainting model to run in the browser with Claude Code