Mistral AI introduces Document AI, a modular platform for automated document processing that combines optical character recognition (OCR), structured data output, and natural language processing with flexible deployment options.
Document AI can extract text from PDFs, PowerPoint and Word files, handwritten notes, tables, diagrams, and complex layouts with high accuracy.
Beyond simple text recognition, Document AI includes an advanced annotation feature that lets users extract targeted information from documents and convert it into custom JSON formats.
Mistral offers two annotation types: "BBox Annotation," which tags and describes individual visual elements like diagrams, tables, or signatures, and "Document Annotation," which captures the structure of an entire document. The latter is currently limited to source files of up to eight pages.
Both options enable automated extraction of specific content, such as contract clauses, invoice amounts, transaction data from receipts, or chapter headings and URLs from scientific PDFs.
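The eight-page limit on Document Annotation matters for anyone working with longer files. One possible workaround, sketched below, is to process a long document in batches of at most eight pages. This is an assumption rather than a documented Mistral recommendation: splitting may lose structure that spans batch boundaries, and the names `MAX_PAGES_PER_REQUEST` and `page_batches` are hypothetical helpers, not part of any Mistral SDK.

```python
from typing import List

# Page limit for Document Annotation as cited in the article.
MAX_PAGES_PER_REQUEST = 8


def page_batches(total_pages: int, batch_size: int = MAX_PAGES_PER_REQUEST) -> List[range]:
    """Split zero-based page indices 0..total_pages-1 into consecutive batches.

    Every batch holds at most `batch_size` pages; the last one may be shorter.
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        range(start, min(start + batch_size, total_pages))
        for start in range(0, total_pages, batch_size)
    ]


if __name__ == "__main__":
    # A 20-page file yields batches of 8, 8, and 4 pages.
    for batch in page_batches(20):
        print(f"pages {batch.start}-{batch.stop - 1} ({len(batch)} pages)")
```

Each batch could then be submitted as its own request, with the per-batch results merged afterwards. How to merge them depends on the data model in use.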

Annotations are based on user-defined data models and can be combined with a vision-capable language model to interpret even complex layouts and content.
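To make "user-defined data model" concrete, the sketch below defines a target structure for an invoice and derives a JSON Schema from it using only the Python standard library. The class `InvoiceAnnotation`, its fields, and the helpers `to_json_schema` and `parse_annotation` are illustrative assumptions. The article does not describe how a schema is submitted to Document AI, so that step is deliberately left out.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class InvoiceAnnotation:
    """Hypothetical target structure for one invoice (not a Mistral-defined type)."""
    invoice_number: str
    total_amount: float
    currency: str


# Map the Python types used above to JSON Schema type names.
_JSON_TYPES = {str: "string", float: "number", int: "integer", bool: "boolean"}


def to_json_schema(model: type) -> dict:
    """Build a flat JSON Schema object from a dataclass with primitive fields."""
    properties = {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(model)}
    return {
        "title": model.__name__,
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }


def parse_annotation(raw_json: str) -> InvoiceAnnotation:
    """Load a JSON annotation into the data model; unknown or missing keys raise TypeError."""
    return InvoiceAnnotation(**json.loads(raw_json))


if __name__ == "__main__":
    print(json.dumps(to_json_schema(InvoiceAnnotation), indent=2))
    # Example payload shaped like the model; the values are made up for illustration.
    sample = '{"invoice_number": "INV-001", "total_amount": 129.5, "currency": "EUR"}'
    print(parse_annotation(sample))
```

Keeping the model in one place means the same definition can describe the requested output format and check the JSON that comes back.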
According to Mistral, the platform is a good fit for organizations handling large volumes of diverse documents and looking for high levels of automation. Annotation features require more compute than basic OCR and are billed separately.
Multilingual support across 40+ languages
One key feature of Document AI is support for over 40 languages, including many non-Latin scripts. The system can recognize text from handwritten documents or challenging layouts, and Mistral claims an accuracy rate above 99 percent.
The platform is designed for a range of sectors, including government agencies, energy companies, research organizations, and legal departments. It also supports training domain-specific OCR models through fine-tuning. For example, users can analyze medical records or contracts using custom extraction rules.
Local or cloud deployment
Document AI can run on-premises or in private cloud environments, making it suitable for organizations with strict data protection, sovereignty, or regulatory requirements, especially in Europe or in security-sensitive industries.
Companies can use the platform to build end-to-end document pipelines, from text recognition and extraction to automated analysis. The API is accessible through Mistral's developer platform, la Plateforme, and a free trial is available via the chat interface, le Chat.
Processing 1,000 pages through the API costs one US dollar. Extracting information in a predefined format ("annotations") costs three US dollars per 1,000 pages.
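For budgeting, the two quoted rates can be turned into a quick estimate. The sketch below assumes strictly pro-rata billing with no minimum charge, volume discount, or rounding rule, none of which the article specifies. It also keeps the two rates separate, because the article does not say whether the annotation price is charged in addition to the basic rate or in place of it. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Prices per 1,000 pages in US dollars, as quoted in the article.
OCR_USD_PER_1000_PAGES = Decimal("1")
ANNOTATION_USD_PER_1000_PAGES = Decimal("3")


def estimate_cost(pages: int, usd_per_1000_pages: Decimal) -> Decimal:
    """Return the pro-rata cost in USD, rounded to cents, for `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    cost = Decimal(pages) * usd_per_1000_pages / Decimal(1000)
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    pages = 25_000
    print("Basic OCR: ", estimate_cost(pages, OCR_USD_PER_1000_PAGES), "USD")
    print("Annotation:", estimate_cost(pages, ANNOTATION_USD_PER_1000_PAGES), "USD")
```

At these rates, 25,000 pages work out to 25 US dollars for basic processing and 75 US dollars at the annotation rate.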
Mistral first introduced its OCR API in March 2025 as a foundation for Document AI. That API marked the company's initial move into modular document processing, combining fast text recognition with structured data output.