
Mistral OCR: a new benchmark in character recognition
- Maxime Hiez
- Mistral AI
- 18 Apr, 2025
Introduction
In March 2025, Mistral AI announced Mistral OCR, an optical character recognition (OCR) API for document understanding. It processes and transcribes complex documents quickly and accurately, and in the benchmark results shown later in this post it scores higher than the other OCR models it is compared with.
Mistral OCR key features
Complex document understanding
Mistral OCR excels at understanding complex document elements, including interleaved images, mathematical expressions, tables, and advanced layouts such as LaTeX formatting. The model enables in-depth understanding of rich documents such as scientific articles with graphs, equations, and figures.
Multilingual and multimodal
The model is natively multilingual and multimodal, meaning it can process documents in multiple languages and formats. It supports PDFs, images, and uploaded documents, and can extract structured content while preserving the document hierarchy and formatting.
Top-notch performance
In benchmark tests, Mistral OCR has consistently outperformed other leading OCR models (see the comparison table below). In addition to text, it extracts the images embedded in a document. Results are returned in Markdown format for easy parsing and rendering.
Mistral OCR highlights
- Complex document understanding
- Natively multilingual and multimodal
- Best-in-class benchmark results
- Fastest in its class
- Structured and rapid output
- Selectively available for self-hosting for organizations handling highly sensitive or classified information
Comparison with other OCR models
Mistral OCR stands out for its ability to understand and transcribe complex documents, including multimodal and multilingual content. In the benchmark results below (higher is better), it scores highest overall and in every category: math, multilingual, scanned documents, and tables.
Model | Overall | Math | Multilingual | Scanned | Tables |
---|---|---|---|---|---|
Google Document AI | 83.42 | 80.29 | 86.42 | 92.77 | 78.16 |
Azure OCR | 89.52 | 85.72 | 87.52 | 94.65 | 89.52 |
Gemini-1.5-Flash-00 | 90.23 | 89.11 | 86.76 | 94.87 | 90.48 |
Gemini-1.5-Pro-002 | 89.92 | 88.48 | 86.33 | 96.15 | 89.71 |
Gemini-2.0-Flash-00 | 88.69 | 84.18 | 85.80 | 95.11 | 91.46 |
GPT-4o-2024-11-20 | 89.77 | 87.55 | 86.00 | 94.58 | 91.70 |
Mistral OCR 2503 | 94.89 | 94.29 | 89.55 | 98.96 | 96.12 |
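To make the gap concrete, here is a minimal sketch that computes how far Mistral OCR 2503 sits ahead of the best competing model in each column. The scores and model names are copied verbatim from the table above; the names `COLUMNS`, `SCORES`, and `lead_over_runner_up` are illustrative and not part of any Mistral tooling.

```python
# Scores copied from the comparison table above, in column order.
COLUMNS = ["Overall", "Math", "Multilingual", "Scanned", "Tables"]

SCORES = {
    "Google Document AI": [83.42, 80.29, 86.42, 92.77, 78.16],
    "Azure OCR": [89.52, 85.72, 87.52, 94.65, 89.52],
    "Gemini-1.5-Flash-00": [90.23, 89.11, 86.76, 94.87, 90.48],
    "Gemini-1.5-Pro-002": [89.92, 88.48, 86.33, 96.15, 89.71],
    "Gemini-2.0-Flash-00": [88.69, 84.18, 85.80, 95.11, 91.46],
    "GPT-4o-2024-11-20": [89.77, 87.55, 86.00, 94.58, 91.70],
    "Mistral OCR 2503": [94.89, 94.29, 89.55, 98.96, 96.12],
}


def lead_over_runner_up(scores, target="Mistral OCR 2503"):
    """Return, per column, the target's score minus the best score among the other models."""
    leads = {}
    for i, column in enumerate(COLUMNS):
        best_other = max(row[i] for name, row in scores.items() if name != target)
        leads[column] = round(scores[target][i] - best_other, 2)
    return leads


if __name__ == "__main__":
    for column, lead in lead_over_runner_up(SCORES).items():
        print(f"{column}: +{lead:.2f} points")
```

On these figures the lead ranges from about 2 points (multilingual) to about 5 points (math).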
Using Mistral OCR
Mistral OCR is available through the API under the model name mistral-ocr-latest. It is priced at 1,000 pages per dollar, or approximately twice as many pages per dollar with batch inference. The API is accessible today on la Plateforme, Mistral AI's developer platform.
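As a rough illustration, the sketch below builds an OCR request with the Python standard library only, joins the per-page Markdown from a response, and estimates cost from the pricing above. Several things here are assumptions not stated in this post: the endpoint URL `https://api.mistral.ai/v1/ocr`, the request body shape (`document` with `type` and `document_url`), and the response shape (`pages`, each with a `markdown` field). The helper names are hypothetical. Check the official API reference before relying on any of them.

```python
import json
import urllib.request

# Assumption: REST endpoint for the OCR API (not given in this post).
OCR_ENDPOINT = "https://api.mistral.ai/v1/ocr"


def build_ocr_request(api_key: str, document_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the OCR model to process a document by URL."""
    payload = {
        "model": "mistral-ocr-latest",
        # Assumption: the document is referenced by a publicly reachable URL.
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def pages_to_markdown(response: dict) -> str:
    """Join per-page Markdown, assuming a response like {"pages": [{"markdown": "..."}]}."""
    return "\n\n".join(page.get("markdown", "") for page in response.get("pages", []))


def estimated_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate cost at 1,000 pages per dollar, or roughly 2,000 per dollar in batch mode."""
    pages_per_dollar = 2000 if batch else 1000
    return pages / pages_per_dollar
```

To actually send the request, pass the result of `build_ocr_request` to `urllib.request.urlopen`, decode the JSON reply, and hand it to `pages_to_markdown`. Building the request separately from sending it keeps the example inspectable without an API key. The batch figure is only approximate, since the post says "approximately twice as many pages per dollar".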
Conclusion
Mistral OCR represents a significant advancement in optical character recognition, offering a new level of document understanding capabilities. With its accuracy, speed, and multilingual and multimodal versatility, Mistral OCR is ideal for organizations seeking to harness the potential of unstructured information.