2 min read AI-generated

Mistral OCR 4: Document Recognition with 170 Languages and 2,000 Pages per Minute

Mistral has launched OCR 4, a document recognition model supporting 170 languages, paragraph-level bounding boxes, and self-hosting options. API pricing is $4 per 1,000 pages.

Mistral just released its new document recognition model, OCR 4, and the specs are impressive: 170 languages, paragraph-level bounding boxes, 2,000 pages per minute on a single GPU, and API pricing at $4 per 1,000 pages.

What OCR 4 does

The model is a significant leap over its predecessor. The key addition is paragraph-level bounding boxes, which let you not only extract text but also locate it precisely on the page. That’s critical for structured documents like contracts, invoices, or academic papers, where the position of a paragraph often tells you what it means.
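To make the bounding-box idea concrete, here is a minimal sketch of how downstream code might use paragraph-level boxes to pull text from one region of a page. The `BBox` and `Paragraph` types, the pixel coordinates, and the sample invoice are all hypothetical and chosen for illustration; this is not Mistral’s actual response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in page coordinates (assumed: origin top-left, pixels)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "BBox") -> bool:
        # Two boxes overlap only if they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class Paragraph:
    """Hypothetical OCR result: one paragraph with its page and location."""
    page: int
    text: str
    bbox: BBox


def paragraphs_in_region(paragraphs, page, region):
    """Return the paragraphs on `page` whose box overlaps `region`."""
    return [p for p in paragraphs if p.page == page and p.bbox.intersects(region)]


# Made-up OCR output for a one-page invoice.
sample = [
    Paragraph(1, "Invoice No. 1234", BBox(50, 40, 300, 70)),
    Paragraph(1, "Total due: 980.00 EUR", BBox(350, 700, 560, 730)),
    Paragraph(1, "Payment terms: 30 days", BBox(50, 700, 300, 730)),
]

# Look only at the bottom-right corner, where totals often sit.
bottom_right = BBox(300, 650, 600, 800)
hits = paragraphs_in_region(sample, page=1, region=bottom_right)
print([p.text for p in hits])
```

With plain text extraction, finding the invoice total means pattern-matching over the whole page. With boxes, you can restrict the search to the region where that field usually appears.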

In Mistral’s internal benchmarks, OCR 4 achieves a 72% win rate against the previous version. The throughput is equally notable: 2,000 pages per minute on a single GPU, which matters for enterprise use cases dealing with large document volumes.
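A quick back-of-the-envelope calculation shows what that throughput means for a large backlog. The archive size below is a made-up example, and linear scaling across GPUs is an assumption, not something Mistral has stated.

```python
PAGES_PER_MINUTE = 2_000  # single-GPU figure quoted by Mistral


def processing_hours(total_pages: int, gpus: int = 1) -> float:
    """Hours to process `total_pages`, assuming throughput scales linearly with GPUs."""
    return total_pages / (PAGES_PER_MINUTE * gpus) / 60


archive_pages = 5_000_000  # hypothetical document archive
print(f"1 GPU:  {processing_hours(archive_pages):.1f} h")
print(f"4 GPUs: {processing_hours(archive_pages, gpus=4):.1f} h")
```

Under these assumptions, a five-million-page archive is a job of roughly two days on one GPU rather than weeks.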

Why self-hosting matters

The most interesting aspect is the self-hosting option. Many companies — especially in Europe — can’t or won’t send their documents through external APIs. Contracts, patents, personnel files: all too sensitive for the cloud. Mistral offers the option to run OCR 4 on your own servers.

It’s a smart move. Mistral is positioning itself as the European alternative to Google Document AI and AWS Textract — with the advantage that data never has to leave your network.

My take

OCR doesn’t sound exciting, but it’s one of the most important AI applications in the enterprise. Most companies are sitting on mountains of documents that need to be digitized, made searchable, and analyzed. Whoever does that faster, cheaper, and in a more compliance-friendly way has a real competitive edge.

$4 per 1,000 pages is aggressive pricing. For comparison: AWS Textract costs between $1.50 and $15 per 1,000 pages depending on the feature set. Mistral sits toward the lower end of that range, with the bonus of letting you run the model yourself.
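Here is a small sketch of how those per-1,000-page prices translate into a bill. The prices are the ones quoted above; the monthly volume is a hypothetical example, and the calculation ignores volume discounts, free tiers, and the hardware cost of self-hosting.

```python
def cost_usd(pages: int, price_per_1000: float) -> float:
    """Cost in USD for `pages` at a flat per-1,000-page price."""
    return pages / 1000 * price_per_1000


# Prices per 1,000 pages as quoted in this article.
prices = {
    "Mistral OCR 4 (API)": 4.00,
    "AWS Textract (low)": 1.50,
    "AWS Textract (high)": 15.00,
}

monthly_pages = 250_000  # hypothetical monthly volume

for name, price in prices.items():
    print(f"{name:<22} ${cost_usd(monthly_pages, price):>10,.2f}")
```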

