Document parsing was a quiet bottleneck in early RAG deployments. PDFs with multi-column layouts, embedded tables, and mathematical expressions frequently lost their structure during preprocessing, degrading the quality of everything downstream. Mistral OCR, announced March 6, 2025, was Mistral AI’s answer to that problem.
The API was built to handle interleaved imagery, mathematical expressions, and tables: formats that generic OCR tools frequently mangled. Microsoft’s technical documentation corroborated the multi-column and table handling capabilities. Mistral AI described Mistral OCR as setting “a new standard in document understanding,” though independent evaluation of that accuracy claim was not available at the time of launch. Pricing landed at approximately $1 per 1,000 pages, with batch inference roughly doubling the number of pages processed per dollar. The model shipped as the default document understanding layer in Le Chat, Mistral’s consumer product, at launch.
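The launch pricing translates into simple back-of-envelope arithmetic. The sketch below encodes the figures quoted above (~$1 per 1,000 pages, with batch inference roughly doubling throughput per dollar); the rates are illustrative assumptions, not a published rate card.

```python
def ocr_cost_usd(pages: int, batch: bool = False) -> float:
    """Estimate Mistral OCR cost at the launch price point.

    Assumes ~$1 per 1,000 pages for standard inference, and that
    batch inference roughly doubles pages per dollar (i.e. halves
    the per-page cost). Illustrative figures only.
    """
    per_thousand = 1.00          # ~$1 per 1,000 pages (standard)
    if batch:
        per_thousand /= 2        # batch: ~2x pages per dollar
    return pages / 1000 * per_thousand


# A 50,000-page corpus under both pricing modes:
print(ocr_cost_usd(50_000))               # standard inference
print(ocr_cost_usd(50_000, batch=True))   # batch inference
```

At these assumed rates, a 50,000-page corpus costs about $50 standard or $25 batched, which is the scale of economics that made OCR-as-preprocessing plausible for large RAG corpora.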
Around the same period, Mistral AI updated Mistral 7B to version 0.3. The update extended the vocabulary to 32,768 tokens and added native function calling via dedicated tokens (TOOL_CALLS, AVAILABLE_TOOLS, and TOOL_RESULTS), as confirmed by the Hugging Face model card. The v0.3 release date has not been independently confirmed here; the pairing of the two releases is thematic rather than anchored to a single verified date.
Native function calling in a 7B model mattered for a specific reason. Smaller models had historically required fragile prompt-engineered templates to produce structured tool calls reliably. Moving that contract into the token vocabulary reduced the parsing brittleness that broke many early agentic workflows, and it did so at a parameter count small enough to run on a single consumer GPU. For teams building self-hosted agents, that combination was hard to match with closed commercial APIs.
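The practical payoff of moving the contract into the vocabulary is that client code can split on a fixed marker instead of fuzzy-matching free text. The sketch below shows one way a client might extract a structured call from v0.3-style output, assuming the model emits a [TOOL_CALLS] marker followed by a JSON array of calls; the exact serialized string is an assumption for illustration (real decoding goes through the v3 tokenizer), not the model’s verbatim wire format.

```python
import json

def extract_tool_calls(decoded: str) -> list[dict]:
    """Return the list of tool calls in a decoded completion, or [].

    Assumes the convention that a [TOOL_CALLS] marker precedes a
    JSON array of {"name": ..., "arguments": ...} objects. This is
    an illustrative parser, not the official client implementation.
    """
    marker = "[TOOL_CALLS]"
    if marker not in decoded:
        return []                      # plain text: no tool call emitted
    payload = decoded.split(marker, 1)[1].strip()
    return json.loads(payload)


# A hypothetical decoded completion containing one tool call:
output = '[TOOL_CALLS] [{"name": "get_weather", "arguments": {"city": "Paris"}}]'
calls = extract_tool_calls(output)
print(calls[0]["name"])        # get_weather
print(extract_tool_calls("The weather is sunny."))  # []
```

Compare this with the prompt-engineered alternative: without a dedicated token, the parser has to guess where prose ends and JSON begins, which is exactly the brittleness the paragraph above describes.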
Together, these two releases addressed adjacent problems: OCR for getting documents into a RAG pipeline cleanly, and function calling for getting structured outputs out of the model on the other end. That pairing looks deliberate in retrospect, and the open-weights route Mistral took with 7B v0.3 made both components reproducible inside a private deployment rather than dependent on a vendor endpoint.