AI in practice Archive

Dec 18, 2025

Mistral AI has released Mistral OCR 3, an updated version of its document analysis model. The system goes beyond basic extraction: it can interpret cursive handwriting, dense form layouts, and complex table structures, including linked cells. According to the company, this third version outperforms its predecessor in 74 percent of cases, with the largest gains on handwriting, scanned forms, and complex tables. OCR 3 also stacks up well against DeepSeek's specialized character recognition model.

The model is available through an API or the Document AI platform introduced in May. Pricing sits at two dollars per 1,000 pages, with discounts available for bulk orders. The French company—which recently secured a large investment from chip manufacturer ASML—is using this release to solidify its position in document recognition, even as its current generation of open-weight language models trails behind commercial competitors from the US.
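As a rough illustration of the list price quoted above, the following sketch estimates the cost for a given page count. The function name and the sample page counts are hypothetical, and bulk discounts are left out because their terms are not stated here.

```python
def ocr_cost_usd(pages: int, price_per_1000_pages: float = 2.0) -> float:
    """Estimate the list-price cost in US dollars for a number of pages.

    Assumes the quoted rate of two dollars per 1,000 pages and ignores
    any bulk discount, whose terms are not given in the article.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * price_per_1000_pages


# Hypothetical workloads, chosen only for illustration.
for pages in (500, 10_000, 250_000):
    print(f"{pages:>7} pages: ${ocr_cost_usd(pages):,.2f}")
```

At this rate, cost scales linearly with volume, so a single page works out to a fifth of a cent before any discount.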

Source: Mistral AI

Jonathan Kemper

Dec 18, 2025

AI in practice

OpenAI is launching a new learning platform for journalists and publishers called the "Academy for News Organizations." Developed in collaboration with the American Journalism Project and the Lenfest Institute, the initiative aims to teach media organizations how to effectively use artificial intelligence. The program offers on-demand training, practical examples for research and translation, and guidance on establishing internal AI guidelines. OpenAI says the goal is to help editorial teams work more efficiently, freeing up time for core journalistic work.

According to OpenAI, the Academy was developed with critical industry issues in mind, including concerns about job displacement and the reliability of AI-generated content. The platform builds on existing partnerships with major publishers like News Corp and Hearst, with plans to expand the offering next year. These educational initiatives might also be an attempt to smooth over tensions in the industry: while OpenAI courts publishers with tools and training, it is simultaneously battling copyright lawsuits from major media companies like the New York Times and Ziff Davis.

Source: OpenAI | Academy

Jonathan Kemper

Dec 18, 2025

AI in practice

OpenAI has started accepting submissions for ChatGPT apps, which will populate a new directory following a review process. These applications allow users to perform specific actions directly within conversations, such as ordering food. The directory is located in the Tools menu, and users can launch specific apps simply by using the "@" command. While a software development kit (SDK) is currently available in beta, the first batch of tested applications is scheduled to launch in early 2026.

On the security front, OpenAI requires that apps remain suitable for general audiences and request only essential user information. At this stage, developers can link from their ChatGPT apps to external websites or native apps to complete transactions for physical goods. However, the company is exploring additional monetization options—including for digital goods—and notes that it has been collaborating with PayPal for several months. This rollout follows October's Dev Day, where OpenAI introduced the Apps SDK alongside its AgentKit for autonomous AI agents.

Source: OpenAI

Matthias Bastian

Dec 17, 2025

AI in practice

Google makes Gemini 3 Flash the default for search and slashes reasoning costs

Maximilian Schreiner

Dec 17, 2025

AI in practice

Amazon is reportedly in talks to invest at least $10 billion in OpenAI. According to three people familiar with the discussions who spoke to The Information, the deal would push OpenAI's valuation past $500 billion. The influx of cash is intended to help OpenAI cover its massive server costs, including a recently agreed-upon $38 billion deal with Amazon Web Services (AWS). As part of the arrangement, OpenAI would commit to using Amazon's proprietary "Trainium" AI chips rather than relying solely on Nvidia hardware.

The companies are also discussing the possibility of turning ChatGPT into a shopping platform. However, Microsoft's exclusive rights to sell OpenAI models to cloud customers could limit Amazon's options here. Talks reportedly began in October following OpenAI's corporate restructuring but haven't concluded yet. OpenAI remains in urgent need of capital, as the company expects to burn through more than $100 billion over the next four years.

Source: The Information

Matthias Bastian

Dec 16, 2025

AI in practice

The experimental productivity assistant called CC comes from Google Labs and runs on Gemini. After signing up, CC connects to Gmail, Google Calendar, Google Drive, and the internet to understand your daily routine. AI agents with access to private data like this raise familiar security concerns.

Every morning, CC sends an email summary called "Your Day Ahead." It pulls together your appointments, important tasks, and relevant updates, like upcoming bills or deadlines. The agent can also draft emails and create calendar entries when needed. Users control CC by replying to its emails, sharing preferences, or asking it to remember ideas and tasks.

CC is launching as an early test for users 18 and older in the US and Canada. You'll need a personal Google account plus a subscription to Google AI Ultra or another paid service. Those interested can sign up for the waitlist on the Google Labs website.

Source: Google

Matthias Bastian

Dec 16, 2025

AI in practice

Google has released an update for Gemini 2.5 Flash Native Audio that makes voice assistants more capable. The model now handles complex workflows better, follows user instructions more precisely, and conducts more natural conversations. Compliance with developer instructions jumped from 84 to 90 percent, and call quality in multi-step conversations has also improved.

According to Google, the updated audio model scores 71.5 percent accuracy on function calls in the ComplexFuncBench benchmark, putting it ahead of OpenAI's gpt-realtime at 66.5 percent. It's worth noting, though, that Google likely didn't test against the latest realtime version, which OpenAI released just yesterday.

The update is now available in Google AI Studio, Vertex AI, Gemini Live, and Search Live. Google Cloud customers are already using the technology, and developers can test the model through the Gemini API.

Source: Google

Matthias Bastian

Dec 16, 2025

AI in practice

OpenAI's new ChatGPT image model matches Google's Nano Banana Pro on complex prompts

Maximilian Schreiner

Dec 16, 2025

AI in practice

OpenAI has updated its Realtime API with three new model snapshots designed to improve transcription, speech synthesis, and function calling. According to OpenAI's developer account, the gpt-4o-mini-transcribe snapshot reduces hallucinations by 89 percent compared to whisper-1. For text-to-speech tasks, gpt-4o-mini-tts cuts the word error rate by 35 percent. The gpt-realtime-mini model, which targets voice assistants, follows instructions 22 percent more accurately and improves function calling by 13 percent.

New audio model snapshots are now live in the Realtime API with improvements to reliability, lower error rates, and fewer hallucinations:

- gpt-4o-mini-transcribe-2025-12-15: 89% reduction in hallucinations compared to whisper-1

- gpt-4o-mini-tts-2025-12-15: 35% fewer word... pic.twitter.com/E8clreR1R0

- OpenAI Developers (@OpenAIDevs) December 15, 2025

OpenAI also explicitly mentioned improvements for Chinese, Japanese, Indonesian, Hindi, Bengali, and Italian.


Maximilian Schreiner

Dec 16, 2025

AI in practice

Nvidia is acquiring software provider SchedMD to expand its presence in open-source technology. On Monday, the company confirmed it will continue to distribute SchedMD's "Slurm" software as an open-source product. The software schedules large-scale computing jobs in data centers so that server capacity is used efficiently.

Nvidia views the technology as critical infrastructure for generative AI, noting that developers rely on it to train models. Financial terms of the deal were not disclosed. Founded in California in 2010, SchedMD employs around 40 people and serves clients like cloud provider CoreWeave and the Barcelona Supercomputing Center.

Source: Nvidia