Mistral AI's new OCR API processes documents, tables, and images with higher accuracy than current market solutions, according to Mistral's benchmark tests.
The system announced by Mistral AI processes text, media, tables, and mathematical equations. It can also convert complex infographics from documents into digital format. In Mistral's tests, Mistral OCR achieved 94.89% accuracy, compared with 83.42% for Google Document AI and 89.52% for Azure OCR. The comparison also included several Google Gemini models.
Language support and processing speed
The system processes multiple languages with 99.02% accuracy, exceeding Google Document AI (95.88%) and Azure OCR (97.31%). Using a lightweight architecture, it processes 2,000 pages per minute on one computing node. The cost is set at 1,000 pages per dollar, or 2,000 pages per dollar with batch processing.
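To put the quoted pricing and throughput in perspective, here is a minimal back-of-the-envelope sketch in Python. The constants come from the figures above. The function names are illustrative only, and the calculation assumes that cost and processing time scale linearly with page count, which the article does not state.

```python
# Figures quoted in the article; everything else here is an illustrative assumption.
PAGES_PER_DOLLAR = 1_000         # standard pricing
PAGES_PER_DOLLAR_BATCH = 2_000   # pricing with batch processing
PAGES_PER_MINUTE = 2_000         # quoted throughput on one computing node


def estimate_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimated cost in dollars, assuming price scales linearly with pages."""
    rate = PAGES_PER_DOLLAR_BATCH if batch else PAGES_PER_DOLLAR
    return pages / rate


def estimate_minutes(pages: int) -> float:
    """Estimated processing time on one node at the quoted peak throughput."""
    return pages / PAGES_PER_MINUTE


if __name__ == "__main__":
    pages = 50_000
    print(f"Standard: ${estimate_cost_usd(pages):.2f}")
    print(f"Batch:    ${estimate_cost_usd(pages, batch=True):.2f}")
    print(f"Time:     {estimate_minutes(pages):.1f} minutes on one node")
```

Under these assumptions, a 50,000-page archive would cost about $50 at the standard rate or $25 with batch processing, and would take roughly 25 minutes on a single node.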
A "doc-as-prompt" feature lets users pass an entire document to the model as the prompt. For instance, when given a contract, the system extracts specific details such as the parties involved, the terms, and payment information into an organized format.
Current applications include digitizing scientific papers at research institutions, preserving historical documents, and improving customer service knowledge bases. The system can structure the extracted information in JSON format, making it easier to use with other AI systems.
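As a rough illustration of why structured JSON output is convenient for downstream systems, the following Python sketch parses a contract summary like the one described above. The JSON shape, the field names, and the sample values are all hypothetical and made up for this example. They do not reflect Mistral's actual response schema.

```python
import json
from dataclasses import dataclass

# Hypothetical extraction result; the field names and values are invented
# for illustration and are NOT Mistral's documented output format.
SAMPLE_OUTPUT = """
{
  "parties": ["Acme GmbH", "Example Ltd"],
  "terms": "24 months, renewable",
  "payment": {"amount": 12000, "currency": "EUR"}
}
"""


@dataclass
class ContractSummary:
    parties: list[str]
    terms: str
    payment_amount: float
    payment_currency: str


def parse_contract(raw: str) -> ContractSummary:
    """Turn the (assumed) JSON structure into a typed Python object."""
    data = json.loads(raw)
    payment = data["payment"]
    return ContractSummary(
        parties=list(data["parties"]),
        terms=data["terms"],
        payment_amount=float(payment["amount"]),
        payment_currency=payment["currency"],
    )


if __name__ == "__main__":
    summary = parse_contract(SAMPLE_OUTPUT)
    print(summary.parties, summary.payment_amount, summary.payment_currency)
```

Once the data is in a typed object like this, it can be validated, stored, or passed to another model without any further text parsing.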
Mistral AI offers self-hosting for organizations with stricter data security requirements. The API runs on the company's developer platform "la Plateforme", with availability through cloud and inference partners to follow soon. Free testing is supposed to be available through Mistral's chat interface "le Chat", but this feature was not working at publication time.