New Artificial Analysis benchmark shows OpenAI, Anthropic, and Google locked in a three-way tie at the top

Artificial Analysis just released version 4.0 of its Intelligence Index, ranking AI models across multiple benchmarks. OpenAI's GPT-5.2 at its highest reasoning setting takes the top spot, with Anthropic's Claude Opus 4.5 and Google's Gemini 3 Pro close behind.

The index scores models across four equally weighted categories: Agents, Programming, Scientific Reasoning, and General. Results are less saturated this time, with top models peaking at 50 points compared to 73 in the previous version.
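As a rough illustration of what "equally weighted" means here, the overall score can be read as a plain average of the four category scores. The sketch below assumes that simple reading; the function name and the per-category numbers are hypothetical placeholders, not figures from Artificial Analysis.

```python
from statistics import mean

# Hypothetical per-category scores for a single model.
# These are placeholder values for illustration, NOT real benchmark results.
category_scores = {
    "Agents": 40.0,
    "Programming": 50.0,
    "Scientific Reasoning": 60.0,
    "General": 50.0,
}


def equal_weight_index(scores: dict[str, float]) -> float:
    """Average the category scores so each category counts the same."""
    return mean(scores.values())


index = equal_weight_index(category_scores)
print(f"Index: {index:.1f}")
```

Under this reading, one point gained in any single category moves the overall index by a quarter of a point, no matter which category it is.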

Artificial Analysis Intelligence Index v4.0: GPT-5.2 (xhigh) leads with 50 points, followed by Claude Opus 4.5 (49) and Gemini 3 Pro Preview (48). It's a tight race at the top. | Image: Artificial Analysis
GPT-5.2 (xhigh) tops the cost table at a total of $2,322, followed by Grok 4 ($1,574) and Claude Opus 4.5 ($1,510). Gemini 3 Pro Preview comes in well below that at $988. | Image: Artificial Analysis

The updated index swaps three older tests (AIME 2025, LiveCodeBench, and MMLU-Pro) for a fresh set: AA-Omniscience checks model knowledge across 40 topics while flagging hallucinations, GDPval-AA tests models on practical tasks across 44 professions, and CritPt tackles physics research problems. Artificial Analysis says it ran all tests independently using a standardized approach, with full details available on its website.

China investigates Meta's Manus acquisition for export control violations

China's Ministry of Commerce is looking into whether Meta's purchase of AI startup Manus, valued at $2 billion or more, violated export control rules. According to the Financial Times, authorities want to know if the relocation of Manus employees and technology to Singapore, followed by the sale to Meta, should have required an export license.

Manus's core team relocated to Singapore in the summer of 2025 to distance the company from China-related geopolitical risks. The Beijing offices have sat empty ever since. All three founders, Red Xiao, Peak Ji, and Tao Zhang, also moved from China to Singapore.

The relocation came after a $75 million funding round led by US firm Benchmark. That investment triggered its own set of questions, but from the opposite direction. The US Treasury Department investigated whether American money was flowing into a Chinese AI company without proper authorization. Meta says there were no longer any Chinese ownership stakes in Manus by the time the deal closed. The founders had previously turned down investment offers from local Chinese government entities.

AI industry finds its 2026 narrative as OpenAI and Microsoft argue users are the bottleneck, not models

The AI industry seems to have found its narrative for 2026: AI models are more capable than the people using them. Following Satya Nadella, OpenAI product head Fidji Simo has now weighed in. Her message: "AI models are capable of far more than how most people experience them day to day." OpenAI's goal for 2026 is closing the gap between what AI can do and how people actually use it. The company that turns research into useful products will lead the market.

ChatGPT has over 800 million weekly active users and one million business customers, according to Simo. In 2026, OpenAI plans to evolve it from a chatbot to a more personal "super assistant," one that understands goals, stores context, and helps proactively. A leaked mid-2025 document describes how such a super assistant would compete for human attention.

For businesses, OpenAI wants to build an automated workflow platform, with Codex serving as an "automated teammate" for developers. To justify higher prices, OpenAI needs major improvements in its AI agents: the company is reportedly considering plans costing up to $20,000 per month.

Boston Dynamics unveils production Atlas designed for warehouses and factory floors

Boston Dynamics is turning its humanoid robot Atlas into a commercial product. The first fleet ships to Hyundai in 2026, where the 1.9-meter-tall robot will handle heavy lifting in warehouses and factories.

OpenAI loses top AI researcher Jerry Tworek after seven years

OpenAI is losing yet another senior researcher: Jerry Tworek is out after nearly seven years at the company. Tworek shared the news in a message to his team. He was a key player in building GPT-4, ChatGPT, and OpenAI's first AI coding models, while also helping push new scaling boundaries. Most recently, he ran the "Reasoning Models" team, working on AI systems that can handle complex logical reasoning. He was part of the core group behind the o1 and o3 models, the foundation for much of OpenAI's recent AI progress.

Tworek says he wants "to try and explore types of research that are hard to do at OpenAI." That sounds like a not-so-subtle dig at CEO Sam Altman's relentless focus on products and revenue, which has reportedly been causing tension among researchers. No word yet on where Tworek is headed next.

Abu Dhabi's TII claims its Falcon H1R 7B reasoning model matches rivals seven times its size

The Technology Innovation Institute (TII) from Abu Dhabi has released Falcon H1R 7B, a compact reasoning language model with 7 billion parameters. TII says the model matches the performance of competitors two to seven times larger across various benchmarks, though as always, benchmark scores only loosely correlate with real-world performance, especially for smaller models. Falcon H1R 7B uses a hybrid Transformer-Mamba architecture, which lets it process data faster than comparable models.

Falcon H1R 7B scores 49.5 percent across four benchmarks, outperforming larger models like Qwen3 32B (46.2 percent) and Nemotron H 47B Reasoning (43.5 percent). | Image: Technology Innovation Institute (TII)

The model is available as a complete checkpoint and quantized version on Hugging Face, along with a demo. TII released it under the Falcon LLM license, which allows free use, reproduction, modification, distribution, and commercial use. Users must follow the Acceptable Use Policy, which TII can update at any time.

More than five percent of ChatGPT messages worldwide are about health

More than five percent of all messages sent through ChatGPT worldwide deal with health topics. According to a report OpenAI shared exclusively with Axios, 40 million Americans use the chatbot daily for medical questions. Users ask it to explain medical bills, compare insurance plans, or check symptoms, often because they can't get in to see a doctor right away. OpenAI spotted this trend early and marketed GPT-5 as particularly capable for these kinds of use cases.

The report shows OpenAI now handles nearly two million insurance-related questions per week. The surge came after the Trump administration let long-standing health insurance subsidies expire at the start of the new year.

Using ChatGPT for medical advice comes with serious risks. The models still hallucinate, and many users likely rely on weaker model versions without reasoning capabilities, especially when chatting directly with the AI in voice mode, which uses a lighter model for faster responses. OpenAI's newly released promotional video doesn't mention any of these concerns.