Apple scales back AI health coach as new leadership pushes for faster results

Apple is pulling back on plans for an AI-powered virtual health coach codenamed "Mulberry," according to Bloomberg. Instead of launching the feature as a standalone product, the company will roll out some of its planned capabilities as individual additions to the Health app. The shift comes after a leadership change: Services chief Eddy Cue took over the health division following Jeff Williams' retirement late last year.

Cue told colleagues that Apple needs to move faster and stay more competitive. Rivals like Oura and Whoop are offering better features, particularly in their iPhone apps. The service was originally supposed to launch with iOS 26 but has been delayed multiple times. Apple still plans to build an AI chatbot for health-related questions and wants to use the new Siri chatbot for these queries starting with iOS 27. OpenAI has also entered the health market with ChatGPT Health.

OpenAI's new coding model GPT-5.3-Codex helped build itself during training and deployment

OpenAI has released GPT-5.3-Codex, its latest coding model. The company says it combines GPT-5.2-Codex's coding capabilities with GPT-5.2's reasoning and knowledge, while running 25 percent faster. Most notably, on Terminal-Bench 2.0 it beats the just-released Opus 4.6 by 12 percentage points—a significant gap by current AI standards—while using fewer tokens than its predecessors. On OSWorld, an agentic computer-use benchmark, it scores 64.7 percent versus 38.2 percent for GPT-5.2-Codex. On GDPval, OpenAI's benchmark for knowledge-work tasks across 44 occupations, it matches GPT-5.2.

Image: OpenAI

OpenAI also claims the model played a role in its own development, with the team using early versions to find bugs during training, manage deployment, and evaluate results. The company says the team was "blown away by how much Codex was able to accelerate its own development."

GPT-5.3-Codex is now available to paying ChatGPT users in the Codex app, CLI, IDE extension, and on the web. API access will follow. OpenAI has classified the model as its first with a "High" cybersecurity risk rating, though the company says this is precautionary, as there's no definitive proof such a classification is necessary.

Voxtral Transcribe 2 offers speech recognition at $0.003 per minute

Mistral AI launches Voxtral Transcribe 2, undercutting competitors on speech recognition pricing. The second-generation speech recognition models start at $0.003 per minute and, according to Mistral, outperform GPT-4o mini Transcribe, Gemini 2.5 Flash, and Deepgram Nova in accuracy. The model family comes in two variants: Voxtral Mini Transcribe V2 for processing larger audio files, and Voxtral Realtime for real-time applications with latency under 200 milliseconds. Voxtral Realtime costs twice as much and uses a proprietary streaming architecture that transcribes audio as it arrives - designed for voice assistants, live captioning, or call center analysis.
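To put the pricing in perspective, here is a back-of-the-envelope cost comparison for the two tiers described above. The batch rate comes from the article; the Realtime rate assumes "twice as much" means exactly 2x the batch price.

```python
# Rough cost sketch for Voxtral Transcribe 2 tiers (rates per the article;
# the Realtime rate is assumed to be exactly 2x the batch rate).
BATCH_RATE_PER_MIN = 0.003      # Voxtral Mini Transcribe V2, USD per minute
REALTIME_RATE_PER_MIN = 0.006   # Voxtral Realtime, assumed 2x batch

def transcription_cost(minutes: float, rate_per_min: float) -> float:
    """Return the USD cost of transcribing `minutes` of audio, rounded to cents."""
    return round(minutes * rate_per_min, 2)

# A three-hour recording, the maximum length the article mentions:
three_hours = 3 * 60
print(transcription_cost(three_hours, BATCH_RATE_PER_MIN))     # 0.54
print(transcription_cost(three_hours, REALTIME_RATE_PER_MIN))  # 1.08
```

At these rates, even the maximum three-hour recording costs about half a dollar in batch mode, which is where the "undercutting competitors" claim comes from.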

Both models support 13 languages, including German, English, and Chinese. New features include speaker recognition, word-level timestamps, and support for recordings up to three hours long. Voxtral Realtime is available as open-weights under Apache 2.0 on Hugging Face and via API, while Voxtral Mini Transcribe V2 is only accessible through Le Chat, the Mistral API, and a playground. Mistral released the first Voxtral generation in July 2025.

Cerebras closes $1 billion funding round at $23 billion valuation after landing OpenAI deal

AI chip startup Cerebras Systems has closed a financing round of over one billion dollars. The funding values the company at around 23 billion dollars, according to a press release. Tiger Global led the round, with Benchmark, Fidelity, AMD, Coatue, and other investors participating.

Cerebras, based in Sunnyvale, California, builds specialized AI chips for fast inference - the speed at which AI models generate responses. The company's approach uses an entire wafer as a single chip, called the "Wafer Scale Engine" (WSE). Its current flagship is the WSE-3.

The recently announced deal with OpenAI, worth over ten billion dollars, likely helped attract investors. The AI lab plans to acquire 750 megawatts of computing capacity for ChatGPT over three years to speed up response times for its reasoning and code models. OpenAI is reportedly unhappy with Nvidia's inference speeds. Sam Altman recently promised "dramatically faster" responses when discussing the Codex code model—a promise likely tied to the Cerebras deal.

Chinese AI video model Kling 3.0 takes another step toward usable creative assets

Chinese company Kling has released video model 3.0. The new model is described as an "all-in-one creative engine" for multimodal creation. Key features include improved consistency for characters and elements, video production with 15-second clips and better control, and customizable multi-shot recording. Audio features now support multiple character references along with additional languages and accents. For image generation, Kling 3.0 offers 4K output, a new continuous shooting mode, and what the company calls "more cinematic visuals."

Ultra subscribers get exclusive early access through the Kling AI website. Official details on a general release, API access, or technical documentation aren't available yet. The Kling team published a paper on the Kling Omni models in December 2025. The YouTube channel "Theoretically Media" got early access and published a detailed first impression video. According to the channel, the model should roll out to other subscription levels within a week.

Anthropic pledges to keep Claude ad-free while OpenAI moves forward with ChatGPT advertising

Anthropic positions itself against advertising while exploring commercial chat transactions. In a blog post, the company says Claude will remain ad-free: no sponsored links, no advertiser-influenced responses. Unlike with search engines, users often share personal information in AI chats, and advertising could push conversations toward transactions rather than helpfulness. OpenAI CEO Sam Altman once voiced similar concerns before his company decided to pursue ads after all.

Expanding access to Claude is central to our public benefit mission, and we want to do it without selling our users’ attention or data to advertisers.

Instead, Anthropic plans to fund operations through enterprise contracts and subscriptions. The company is also exploring e-commerce transactions like bookings or purchases Claude handles for users. Anthropic could earn from these, similar to OpenAI's plans. However, the company says Claude's primary goal should always be providing helpful answers.

Anthropic's statement comes shortly after OpenAI revealed its ChatGPT advertising plans. Anthropic even produced a video series poking fun at ChatGPT ads.

Alibaba's Qwen3-Coder-Next delivers solid coding performance in a compact package

Alibaba has released Qwen3-Coder-Next, a new open-weight AI model for programming agents and local development. Trained on 800,000 verifiable tasks, the model has 80 billion parameters in total but only 3 billion active at any time. Despite this small footprint, Alibaba says it outperforms or matches much larger open-source models on coding benchmarks, scoring above 70 percent on SWE-Bench Verified with the SWE-Agent framework. As always, benchmark results are only an indicator of real-world performance.

Performance on coding agent benchmarks: Qwen3-Coder-Next competes with much larger models across multiple coding benchmarks while using only 3 billion active parameters. | Image: Qwen

The model supports 256,000 tokens of context and works with development environments like Claude Code, Qwen Code, Qoder, Kilo, Trae, and Cline. Local tools like Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers also support it. Qwen3-Coder-Next is available on Hugging Face and ModelScope under the Apache 2.0 license. More details are available in the blog post and technical report.
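The 80-billion-total / 3-billion-active split matters for local use: in a mixture-of-experts model, all weights still have to fit in memory, but per-token compute scales with the active parameters. A rough sketch of the weight footprint (a simplification that ignores KV cache, activations, and runtime overhead; the 4-bit figure assumes weight-only quantization):

```python
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return round(params_billions * 1e9 * bytes_per_param / 1e9, 1)

# All 80B parameters must be held in memory, even though only ~3B
# are active per token:
print(weights_gb(80, 2.0))  # 160.0 GB at 16-bit precision
print(weights_gb(80, 0.5))  # 40.0 GB at 4-bit quantization
```

This is why a quantized build of the model is plausible on a high-memory workstation, while per-token speed is closer to that of a 3B dense model.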

OpenAI hires Anthropic's Dylan Scandinaro to lead AI safety as "extremely powerful models" loom

OpenAI has filled its "Head of Preparedness" position with Dylan Scandinaro, who previously worked on AI safety at competitor Anthropic. CEO Sam Altman announced the hire on X, calling Scandinaro "by far the best candidate" for the role. With OpenAI working on "extremely powerful models," Altman said strong safety measures are essential.

In his own post, Scandinaro acknowledged the technology's major potential benefits but also warned of "risks of extreme and even irrecoverable harm." OpenAI recently disclosed that a new coding model received a "high" risk rating in cybersecurity evaluations.

There’s a lot of work to do, and not much time to do it!

Dylan Scandinaro

Scandinaro's Anthropic background adds an interesting layer. The company was founded by former OpenAI employees concerned about OpenAI's product focus and what they saw as insufficient safety measures, and has since become known as one of the more safety-conscious AI developers. Altman says he plans to work with Scandinaro to implement changes across the company.

Anthropic partners with leading research institutes to tackle biology's data bottleneck

Anthropic has announced two partnerships with major US research institutions to develop AI agents for biological research. The Allen Institute and the Howard Hughes Medical Institute (HHMI) will serve as founding partners in the initiative. According to Anthropic, "modern biological research generates data at unprecedented scale," but turning it into "validated biological insights remains a fundamental bottleneck." The company says manual processes "can't keep pace with the data being produced."

HHMI will develop specialized AI agents at the Janelia Research Campus that connect experimental knowledge to scientific instruments and analysis pipelines. The Allen Institute is working on multi-agent systems for data integration and experiment design that could "compress months of manual analysis into hours." According to Anthropic, these systems "are designed to amplify scientific intuition rather than replace it, keeping researchers in control of scientific direction while handling computational complexity."

The move extends Anthropic's push into scientific applications. The company recently launched Cowork, a feature designed for office work that gives Claude access to local files. OpenAI is also targeting the research market with Prism, an AI workspace for scientific writing.

Gemini models dominate new AI rankings for strategic board games

Google's Gemini models are outperforming the competition in board game benchmarks. Google DeepMind and Kaggle have expanded their "Game Arena" platform with two new games: Werewolf and Poker. The platform tests AI models across strategic games that measure different cognitive abilities: chess evaluates logical thinking, Werewolf tests social skills like communication and detecting deception, and Poker assesses how models handle risk and incomplete information.

These games provide objective ways to measure skills like planning and decision-making under uncertainty. Gemini 3 Pro and Gemini 3 Flash currently hold the top spots in all rankings. The Werewolf benchmark serves double duty for security research as well: it tests whether models can detect manipulation without any real-world consequences. According to Google DeepMind CEO Demis Hassabis, the AI industry needs more rigorous tests to properly evaluate the latest models.
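The article doesn't say which rating formula Game Arena uses, but leaderboards built on head-to-head games typically rely on an Elo-style update, where each result shifts both players' ratings by an amount proportional to how surprising the outcome was. A minimal sketch of one such update:

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """One Elo update after a game between players A and B.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    K controls how strongly a single result moves the ratings.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Two evenly rated models: the winner gains 16 points at K=32.
print(elo_update(1500.0, 1500.0, 1.0))  # (1516.0, 1484.0)
```

Beating a much higher-rated opponent moves the ratings far more than beating an equal one, which is what lets a leaderboard converge after relatively few games per model.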