Read full article about: Google's new Gemini API Agent Skill patches the knowledge gap AI models have with their own SDKs

Google has built an "Agent Skill" for the Gemini API that tackles a fundamental problem with AI coding assistants: once trained, language models don't know about their own updates or current best practices. The new skill feeds coding agents up-to-date information about current models, SDKs, and sample code. In tests across 117 tasks, the top-performing model (Gemini 3.1 Pro Preview) jumped from a 28.2 percent to a 96.6 percent success rate. Skills were first introduced late last year by Anthropic and were quickly adopted by other AI companies.
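Converting those rates back to absolute task counts is a quick reader-side sanity check (the counts below are derived from the article's percentages, not separately published by Google):

```python
# Reported success rates across the 117-task benchmark,
# converted to approximate absolute task counts.
TASKS = 117

before = round(0.282 * TASKS)  # without the skill: ~33 tasks solved
after = round(0.966 * TASKS)   # with the skill: ~113 tasks solved
```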

Success rates of Gemini models with and without the agent skill across 117 coding tasks. Newer models in the 3 series benefit far more from the skill than older models, which Google attributes to their stronger reasoning capabilities. | Image: Google

Older 2.5 models saw much smaller improvements, which Google says comes down to weaker reasoning abilities. Interestingly, a Vercel study suggests that giving models direct instructions through AGENTS.md files could be even more effective. Google is exploring other approaches as well, including MCP services. The skill is available on GitHub.
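In the format Anthropic introduced, a skill is typically a folder containing a SKILL.md file: YAML frontmatter that tells the agent when to load it, followed by instructions and reference material. A minimal hypothetical sketch (the name, description, and contents below are illustrative, not Google's actual skill):

```markdown
---
name: gemini-api-docs
description: Up-to-date Gemini API model names, SDK usage, and sample code.
  Use when writing or reviewing code that calls the Gemini API.
---

# Gemini API (current)

- Current SDK: use the `google-genai` package, not the deprecated
  `google-generativeai` package.
- Consult the model list below instead of relying on training data.
```

The agent reads only the frontmatter up front and pulls in the full instructions when a task matches the description, which keeps context usage low.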

Read full article about: OpenAI sets two-stage Sora shutdown with app closing April 2026 and API following in September

OpenAI is killing Sora in two stages. The web and app version goes dark on April 26, 2026, with the Sora API following on September 24, 2026. OpenAI is urging users to download their content before the cutoff dates. Videos and images can be exported directly from the Sora library.

The company says it hasn't decided yet whether there will be a final export window after those dates. If one happens, users will get an email heads-up. Once all deadlines pass, user data gets permanently deleted. The shutdown also takes down the sora.chatgpt.com platform, which handled image and video generation. Full details are on OpenAI's help page under "What to know about the Sora discontinuation."

Sora's demise is part of a bigger strategic pivot. OpenAI wants to funnel compute toward coding tools and enterprise customers—a play that mirrors rival Anthropic—and a super app rolling ChatGPT and other tools into one package. Sora will stick around as a research project focused on world models, with the long-term goal of "automating the physical economy."

Read full article about: Google's new Gemini update makes it easy to import memories from ChatGPT and Claude

Google is borrowing Anthropic's memory import approach, letting Gemini users bring over saved reminders, preferences, and full chat histories from apps like ChatGPT and Claude. The process works by copying a suggested prompt into the previous AI app, generating a summary, and pasting it into Gemini, which saves the information in its own context. Users can also upload chat histories as a ZIP file (up to 5 GB) and continue previous conversations inside Gemini. Google is renaming "Past Chats" to "Memory," with the rollout happening gradually.

Google's new memory import feature in Gemini: users copy a prompt into their previous AI app, then paste the generated summary into Gemini. | Image: Google

Anthropic pioneered this approach after OpenAI drew criticism for taking a military deal that Anthropic had turned down on ethical grounds. With users already looking to switch, Anthropic wanted to give them an extra reason to make the move. Both Google and Anthropic rely on the same basic method for data extraction: a simple prompt that asks the existing AI app to output everything it has stored about the user.

Read full article about: Cohere releases open source model that tops speech recognition benchmarks

Canadian AI company Cohere has released "Transcribe," a new open-source model for automatic speech recognition. Cohere claims the top spot on the Hugging Face Open ASR Leaderboard with an average word error rate of just 5.42 percent, beating out competitors like OpenAI's Whisper Large v3, ElevenLabs Scribe v2, and Qwen3-ASR-1.7B. Cohere says Transcribe also delivers the best throughput among similarly sized models.

The chart compares seven speech recognition models with more than one billion parameters. The x-axis shows accuracy as word error rate (WER), where lower values are better. The y-axis shows throughput (RTFx), measuring how fast a model processes audio relative to real time. Cohere Transcribe leads with an RTFx of 525 and a WER of about 5.4, making it both the fastest and most accurate model. NVIDIA Canary Qwen 2.5B follows with an RTFx of 418. Models like OpenAI's Whisper Large v3 and Voxtral Realtime are significantly slower and less accurate.

Cohere Transcribe compared with seven other speech recognition models. Models closer to the upper left corner perform best, meaning faster throughput and lower word error rates. | Image: Cohere

The 2 billion parameter model supports 14 languages, including English, German, French, and Japanese. It's available for download under the Apache 2.0 license on Hugging Face and can also be accessed through Cohere's API and the Model Vault platform. Cohere plans to integrate Transcribe into its AI agent platform North in the future.

Anthropic leak reveals new model "Claude Mythos" with "dramatically higher scores on tests" than any previous model

Update: The leaked draft blog posts have surfaced online, revealing Anthropic’s plans for a new model class above its existing Opus line. The documents show two possible name candidates, details about a deliberately slow release strategy, and a strong focus on cybersecurity.

Read full article about: OpenAI's Codex gets a plugin marketplace for Slack, Notion, Figma, and more

OpenAI is adding plugins to Codex that integrate with popular work tools like Slack, Figma, Notion, Gmail, and Google Drive. The plugins go beyond coding: OpenAI says they also help with planning, research, and coordination. Under the hood, plugins bundle predefined prompt workflows ("skills"), app integrations, and MCP server configurations into installable packages, similar to ChatGPT integrations. They work across the Codex app, command line, and IDE extensions. Developers can build their own and distribute them through local or team-wide "marketplaces." An official curated directory is already live, with self-publishing coming soon.

The move is part of OpenAI's broader push into coding tools and enterprise customers, which includes a planned "super app" combining ChatGPT, Codex, and the Atlas browser. Codex now has over 1.6 million weekly active users, and a Windows version shipped just recently.

Read full article about: Apple gets full Gemini access and uses distillation to build lightweight on-device AI

Apple has secured broad access rights to Google's Gemini models. According to The Information, Apple has full access to Gemini within its own data centers and can use distillation to build smaller models from it. Gemini generates high-quality answers along with its chain of thought, which then serve as training data for a smaller model. In short, Apple is paying for what Chinese AI companies are allegedly doing in secret: tapping a powerful AI model to generate quality training data for a smaller one.

Because Apple has full access, it can build smaller versions that give the same answers as Gemini and arrive at them the same way. These lighter versions need far less processing power and can run directly on Apple devices.
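The distillation step described above can be sketched in a few lines: the teacher's prompt, chain of thought, and answer are packed into supervised training examples for the student. This is a minimal sketch of the general technique; the function name, record format, and `<think>` delimiter are illustrative assumptions, not Apple's or Google's actual pipeline.

```python
def build_distillation_record(prompt, chain_of_thought, answer):
    """Pack one teacher (e.g. Gemini) output into a training example.

    The student is trained to reproduce the reasoning trace as well as
    the final answer, so it arrives at answers 'the same way'.
    """
    return {
        "input": prompt,
        # Hypothetical target format: reasoning wrapped in <think> tags,
        # followed by the final answer.
        "target": f"<think>{chain_of_thought}</think>\n{answer}",
    }

# Illustrative teacher output:
record = build_distillation_record(
    prompt="What is 17 * 6?",
    chain_of_thought="17 * 6 = 17 * 5 + 17 = 85 + 17 = 102",
    answer="102",
)
```

Training a smaller model on many such records is what lets it mimic the teacher's answers and reasoning at a fraction of the compute cost.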

Since Gemini is built for chatbots and enterprise applications, it doesn't always line up with Apple's plans for Siri, according to The Information. But Apple is still building its own models in parallel through its Apple Foundation Models team. New AI features could drop at Apple's developer conference in June.

Read full article about: Mistral's first open-weight TTS model Voxtral clones voices from three seconds of audio across nine languages

French AI startup Mistral has released Voxtral TTS, its first text-to-speech model. The model supports nine languages—including German, English, French, and Spanish—and is relatively compact at four billion parameters. Mistral says it produces realistic, emotionally expressive speech and can adapt to new voices from as little as three seconds of reference audio. Latency sits at 70 milliseconds for a typical setup with a 10-second speech sample and 500 characters.

In human comparison tests, Voxtral TTS scored higher on naturalness than ElevenLabs Flash v2.5 at a similar response time. That said, ElevenLabs has since shipped a newer model with v3. Voxtral TTS is available through an API at $0.016 per 1,000 characters, can be tested in Mistral Studio, and is also available as an open-weights version on Hugging Face.
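At the published price of $0.016 per 1,000 characters, a quick back-of-envelope cost estimate is straightforward (the character counts below are illustrative, not from Mistral):

```python
PRICE_PER_1K_CHARS = 0.016  # USD, per Mistral's published API price

def tts_cost(num_chars):
    """Estimated API cost in USD for synthesizing num_chars characters."""
    return num_chars / 1000 * PRICE_PER_1K_CHARS

# e.g. a 300,000-character audiobook would cost roughly $4.80
cost = tts_cost(300_000)
```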

Read full article about: OpenAI and Anthropic before the IPO: Different balance sheets make comparison difficult

Anthropic and OpenAI are both growing fast, but they report revenue very differently, The Information reports. OpenAI's annualized revenue is around $25 billion, while Anthropic's is $19 billion. Both calculate this similarly: four weeks of revenue times 13, with Anthropic adding monthly subscriptions times 12.
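The two annualization formulas are simple multipliers (the input figures below are illustrative, not either company's actual books):

```python
def annualize_usage(four_week_revenue):
    # Four weeks of revenue times 13, i.e. 52 weeks.
    return four_week_revenue * 13

def annualize_subscriptions(monthly_revenue):
    # Monthly subscription revenue times 12.
    return monthly_revenue * 12

# Illustrative: $1.5B per four-week period of usage revenue
# plus $0.2B per month of subscriptions.
annualized = annualize_usage(1.5e9) + annualize_subscriptions(0.2e9)
```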

The key difference is how they handle cloud partners. OpenAI gives 20 percent of revenue to Microsoft and reports the number before that deduction. For Azure cloud sales, it only counts its 20 percent cut. Anthropic does the opposite: it books all cloud sales through AWS, Microsoft, and Google as its own revenue, listing the providers' shares as sales and marketing costs. Anthropic considers itself the primary provider, while OpenAI treats Microsoft as the primary provider for Azure.
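The two booking conventions can be sketched side by side. All figures below are illustrative; only the 20 percent Microsoft share comes from the article.

```python
def openai_reported(direct_rev, azure_rev):
    # Direct sales are reported before Microsoft's 20% share is deducted;
    # Azure cloud sales are counted only at OpenAI's ~20% cut.
    return direct_rev + 0.20 * azure_rev

def anthropic_reported(direct_rev, cloud_rev):
    # Cloud sales via AWS, Microsoft, and Google are booked in full;
    # the providers' shares show up later as sales-and-marketing cost.
    return direct_rev + cloud_rev

# Same hypothetical business, different headline number:
# $10B direct revenue plus $5B of sales through cloud partners.
gross_style = anthropic_reported(10e9, 5e9)
net_style = openai_reported(10e9, 5e9)
```

This is why the article calls the headline figures difficult to compare: with identical underlying sales, the gross-style number comes out well above the net-style one.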

Both follow US accounting rules (GAAP), but their numbers are difficult to compare. Anthropic's revenue likely looks higher on paper than it would under OpenAI's method. That matters as both companies head toward an IPO.