
OpenAI is bringing ChatGPT Voice directly into the main text chat, making it easier to switch between speaking and typing without jumping into a separate mode. Users can talk naturally, see responses as text, revisit earlier messages, and view visual content like images or maps without losing context. The update is rolling out on mobile and the web. Anyone who prefers the old setup can still turn the standalone voice mode back on in the settings under "Voice Mode."

Video: OpenAI

ChatGPT's Advanced Voice Mode has been available since September 2024. In June 2025, OpenAI boosted the voice system's expressiveness and added real-time translation.


Deep Cogito has released Cogito‑v2.1‑671B, a finetune of a Deepseek base model from November 2024 (presumably Deepseek R1‑Lite, since Deepseek‑V3‑Base did not ship until December), and positions it as the "best open-weight LLM by a US company." After retraining the model internally, the company says it now competes with top closed and open systems and outperforms other US open models such as GPT‑OSS‑120B.

Bar chart: average token consumption per model, with Cogito v2.1 lowest at 4,894 tokens and Gemini 2.5 Pro highest at 9,178.
The chart shows the average number of generated tokens per benchmark instance, averaged across all benchmarks.

According to Deep Cogito, Cogito v2.1's main advantage is efficiency. The model uses far fewer tokens on standard benchmarks than comparable systems, which can lower API costs. The team also trained it with process monitoring for thought steps, allowing it to reach conclusions with shorter reasoning chains. They report improvements in prompt-following, programming tasks, long-form queries, and creativity. Users can try the model for free through chat.deepcogito.com, where the developer says no chats are stored. The model weights are available on Hugging Face, and smaller editions are planned.
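To make the efficiency claim concrete, here is a small back-of-the-envelope sketch using the average token counts from the chart above. The per-token price is a made-up placeholder, not a published rate for either model; only the token counts come from the source.

```python
# Toy cost comparison based on the average benchmark token counts
# reported above. PRICE_PER_MTOK is a hypothetical placeholder price
# (dollars per million output tokens), not a real published rate.
PRICE_PER_MTOK = 10.0

AVG_TOKENS = {
    "Cogito v2.1": 4894,     # lowest average in the chart
    "Gemini 2.5 Pro": 9178,  # highest average in the chart
}

def cost_per_query(avg_tokens: int, price_per_mtok: float = PRICE_PER_MTOK) -> float:
    """Expected output-token cost in dollars for one benchmark instance."""
    return avg_tokens / 1_000_000 * price_per_mtok

for model, tokens in AVG_TOKENS.items():
    print(f"{model}: ${cost_per_query(tokens):.4f} per query")

# At equal per-token prices, fewer generated tokens translate directly
# into proportionally lower output costs.
savings = 1 - AVG_TOKENS["Cogito v2.1"] / AVG_TOKENS["Gemini 2.5 Pro"]
print(f"Token savings vs. the highest consumer: {savings:.0%}")
```

At these averages, Cogito v2.1 generates roughly 47 percent fewer tokens per benchmark instance than Gemini 2.5 Pro, which is where the cost argument comes from.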


Microsoft, Nvidia, and Anthropic have unveiled a set of new strategic partnerships valued at 45 billion dollars. Anthropic plans to scale its Claude models on Microsoft Azure and has agreed to purchase 30 billion dollars in Azure compute capacity, plus up to one gigawatt of additional capacity. As part of the deal, Nvidia and Anthropic are collaborating closely for the first time on model design and engineering, tuning Claude for Nvidia's architectures. The compute stack includes Nvidia's Grace Blackwell and Vera Rubin systems. Nvidia is investing up to 10 billion dollars in Anthropic, while Microsoft is investing up to 5 billion dollars.

Microsoft Foundry customers will gain access to Claude models like Claude Sonnet 4.5, Claude Opus 4.1, and Claude Haiku 4.5. With this move, Claude becomes the only top-tier model available across all three major cloud platforms. Microsoft also continues using Claude across its Copilot lineup, including GitHub Copilot and Microsoft 365 Copilot. CEOs Dario Amodei, Satya Nadella, and Jensen Huang introduced the partnerships in a ten-minute announcement video.


Cloudflare is acquiring Replicate and folding its massive model library into Workers AI. The deal pushes Cloudflare's inference platform past 50,000 available models. Replicate users can keep using their existing APIs, while Workers AI users gain access to a far larger catalog along with new fine-tuning options. Both companies plan to bring Replicate's full library to Workers AI and let developers run their own models directly on Cloudflare's network.

Replicate has become a major hub for developers who want easy API access to AI models. Cloudflare brings its global network and serverless inference system to the table. "Together, we're going to become the default for building AI apps," said Replicate cofounder Ben Firshman. Replicate will remain an independent brand, operating with Cloudflare's support and infrastructure behind it.
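For context on what "easy API access" means in practice, here is a minimal sketch of a model run through Replicate's Python client (`pip install replicate`). The model slug and prompt are illustrative placeholders; the real call needs a `REPLICATE_API_TOKEN` environment variable and network access, so it is wrapped in a function this sketch never invokes.

```python
# Sketch of a Replicate model run. Slug, prompt, and parameters are
# illustrative examples, not specifics from the article.

def build_input(prompt: str, **params) -> dict:
    """Assemble the input payload a Replicate model run expects."""
    payload = {"prompt": prompt}
    payload.update(params)
    return payload

def run_example() -> object:
    """Perform the actual API call (requires network and an API token)."""
    import replicate  # third-party client: pip install replicate
    return replicate.run(
        "black-forest-labs/flux-schnell",  # example public model slug
        input=build_input("a lighthouse at dusk", num_outputs=1),
    )

# Offline demonstration of the payload builder only:
print(build_input("a lighthouse at dusk", num_outputs=1))
```

Per Cloudflare's announcement, calls like this should keep working unchanged after the acquisition, which is the point of preserving Replicate's existing APIs.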

Cloudflare, best known for its DNS services, recently introduced a system that blocks AI crawlers by default and gives website owners more control over how their content is accessed.


Google Deepmind introduced WeatherNext 2, an upgraded version of its AI weather model that the company says outperforms the previous release across 99.9 percent of all meteorological variables and forecast ranges. The system delivers stronger results for core measurements like temperature, wind, and humidity for timeframes from zero to 15 days. According to Google, it also produces forecasts eight times faster and can generate outputs with resolutions as fine as one hour. The model can run hundreds of possible weather scenarios in under a minute on a single TPU, while traditional physics-based systems running on supercomputers would need hours to complete the same task.

Deepmind attributes the model's performance to a new technique called a Functional Generative Network, which injects perturbation signals directly into the architecture to keep predictions physically realistic. WeatherNext is already built into Google Search, Gemini, Pixel Weather, and the Weather API, and Google Maps integration is on the way.
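Deepmind has not published runnable WeatherNext 2 code here, so the following is only a generic toy illustration of ensemble forecasting: perturb an initial state, roll every member forward with a simple dynamics step, and read the mean and spread off the results. The relaxation dynamics and noise scale are invented for the example; WeatherNext 2's Functional Generative Network injects noise inside the model rather than only at the initial conditions.

```python
import random

def step(temp: float) -> float:
    """Toy 'dynamics': relax temperature toward a 15-degree mean.

    A stand-in for a learned forecast model, not anything resembling
    WeatherNext 2's actual architecture.
    """
    return temp + 0.1 * (15.0 - temp)

def ensemble_forecast(initial_temp: float, members: int = 100,
                      steps: int = 24, noise: float = 0.5,
                      seed: int = 0) -> list[float]:
    """Run `members` perturbed scenarios and return final temperatures."""
    rng = random.Random(seed)
    finals = []
    for _ in range(members):
        # Classic ensemble trick: perturb the initial condition so each
        # member explores a slightly different plausible starting state.
        temp = initial_temp + rng.gauss(0.0, noise)
        for _ in range(steps):
            temp = step(temp)
        finals.append(temp)
    return finals

finals = ensemble_forecast(10.0)
mean = sum(finals) / len(finals)
spread = max(finals) - min(finals)
print(f"ensemble mean: {mean:.2f} C, spread: {spread:.2f} C")
```

The appeal of the AI approach is that each member is a cheap forward pass rather than a full physics simulation, which is why hundreds of scenarios can fit in under a minute on a single accelerator.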

Deepmind has been pushing hard on AI-driven weather research for years. In December 2024, the lab introduced GenCast, a diffusion-based model designed to further improve short-term and medium-range forecasting.
