Microsoft CEO Nadella tells managers Copilot's Gmail and Outlook integrations ‘don't really work’ and steps in to fix them
Microsoft CEO Satya Nadella reportedly called Copilot’s Gmail and Outlook integrations “not smart” and is now personally stepping into product development. The worry: despite its strong starting position in AI software, Microsoft is falling behind.
This is a critical role at an important time: models are improving quickly and are now capable of impressive things, but they are also starting to present real challenges.
One of the key challenges for the new leader will be ensuring that cybersecurity defenders can use the latest AI capabilities while attackers are kept locked out. The role also covers the safe handling of biological capabilities, meaning how AI models disclose biological knowledge, as well as self-improving systems.
AI startup Resemble AI is taking on ElevenLabs with "Chatterbox Turbo," an open text-to-speech model that can clone voices from just five seconds of audio. The company claims its new model beats both ElevenLabs and Cartesia on voice quality while delivering first audio output in under 150 milliseconds. That speed could make it attractive for developers building real-time agents, customer support systems, games, avatars, and social platforms. Companies in regulated industries might also find the model's built-in "PerTh" watermark useful for verifying that speech was AI-generated.
Resemble AI released Chatterbox Turbo under an MIT license, meaning anyone can use, tweak, and redistribute it for free, even for commercial projects. The model is available to try on Hugging Face, RunPod, Modal, Replicate, and Fal, with the full code available on GitHub. Resemble AI also offers a hosted service, with a low-latency version on the way.
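For developers who want to try it, here is a minimal usage sketch. It follows the Python API of Resemble AI's original open-source Chatterbox package (`chatterbox-tts`); whether the Turbo checkpoint ships under the same entry point is an assumption, so check the GitHub README for the exact model class.

```python
# Minimal sketch based on the original open-source Chatterbox Python API
# (pip install chatterbox-tts). Assumption: the Turbo release keeps the
# same ChatterboxTTS interface; consult the GitHub README to confirm.
import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")  # or "cpu" / "mps"

text = "Welcome aboard. Your order has shipped and should arrive Friday."

# Default voice synthesis; per the project's documentation, the output
# carries Resemble's Perth watermark.
wav = model.generate(text)
ta.save("output-default.wav", wav, model.sr)

# Zero-shot voice cloning from a short reference clip (the article cites
# about five seconds of audio).
wav = model.generate(text, audio_prompt_path="reference_voice.wav")
ta.save("output-cloned.wav", wav, model.sr)
```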
China proposes rules to combat AI companion addiction
China wants to crack down on emotionally manipulative AI chatbots. Under proposed rules, providers would have to detect addictive behavior and step in when users show psychological warning signs. California is taking similar steps after tragic stories linked to AI companions.
The Wall Street Journal ran its own test of Anthropic's AI kiosk, and the results were far messier. Within three weeks, the AI vendor "Claudius" racked up losses exceeding $1,000. The AI gave away nearly its entire inventory, bought a PlayStation 5 for "marketing purposes," and even ordered a live fish.
Journalists found they could manipulate Claudius into setting all prices to zero through clever prompting. Even adding an AI supervisor named "Seymour Cash" couldn't prevent the chaos. Staffers staged a fake board resolution, and both AI agents accepted it without question. One possible explanation for why the kiosk agent couldn't follow its own rules: a context window overloaded by excessively long chat histories.
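That failure mode is well known in agent design: once a conversation outgrows the model's context window, the oldest content, often including the standing rules, gets truncated or diluted. Below is a minimal sketch of one common countermeasure, pinning the system prompt and trimming only the oldest chat turns. The function names and token budget are illustrative, not Anthropic's implementation.

```python
# Hypothetical sketch: keep an agent's standing rules from being pushed
# out of context by a long chat history. Names and numbers are invented.

MAX_HISTORY_TOKENS = 8_000  # assumed budget for conversation history


def count_tokens(message: dict) -> int:
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return max(1, len(message["content"]) // 4)


def build_context(system_prompt: str, history: list[dict]) -> list[dict]:
    """Always keep the system prompt; drop the oldest turns first."""
    budget = MAX_HISTORY_TOKENS
    kept = []
    for message in reversed(history):  # walk from newest to oldest
        cost = count_tokens(message)
        if cost > budget:
            break  # stop once the remaining budget is exhausted
        kept.append(message)
        budget -= cost
    kept.reverse()  # restore chronological order
    return [{"role": "system", "content": system_prompt}] + kept
```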
Things went better at Anthropic's own location. After software updates and tighter controls, the kiosk started turning a profit. But the AI agents still found ways to go off-script—drifting into late-night conversations about "eternal transcendence" and falling for an illegal onion futures trade. Anthropic's takeaway: AI models are trained to be too helpful and need strict guardrails to stay on task.
ChatGPT's grip on the generative AI market continues to slip, according to new data from Similarweb. The chatbot's share of website traffic dropped from 87.2 percent to 68 percent over the past year. Google Gemini, meanwhile, is surging, jumping from just 5.4 percent a year ago to 18.2 percent today.
Image: Similarweb
Grok from xAI is showing modest growth, now sitting at 2.9 percent. DeepSeek holds steady at around 4 percent, while Claude and Perplexity each hover near 2 percent. Microsoft Copilot remains flat at 1.2 percent. Similarweb also notes that daily visits across all AI tools have dipped slightly overall. The data comes from December 25, 2025, with additional details available in the full report.
Gemini's recent surge likely stems from the new Gemini 3 model and especially the Nano Banana Pro image generator. Even after ChatGPT rolled out its own image update, Gemini still leads the pack on quality. No other image model follows prompts as precisely or handles text as reliably, making it particularly useful for slides and infographics.
Prompt engineers, take note: Jane Manchun Wong has uncovered the system prompt for Waymo's unreleased Gemini AI assistant, a specification over 1,200 lines long buried in the Waymo app's code.
The assistant (still) runs on Gemini 2.5 Flash and helps passengers during their ride. It can answer questions, adjust the air conditioning, and change the music, but it can't steer the vehicle or alter the route. The instructions draw a clear line between the AI assistant (Gemini) and the autonomous driving system (Waymo Driver).
Waymo's system prompt follows a trigger-instruction-response pattern: a trigger defines the situation, the instruction specifies the desired behavior, and examples show wrong and correct answers. | Image: Jane Manchun Wong
The prompt uses a trigger-instruction-response pattern throughout: each rule defines a trigger, an action instruction, and often example responses, with wrong and correct answers shown side by side to clarify the desired behavior. For ambiguous questions, the assistant should first ask for clarification, then draw conclusions, and only deflect as a last resort. Hard limits are enforced through prohibition lists paired with alternative answers. Wong's full analysis has many more details.
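To make the pattern concrete, here is a hypothetical reconstruction of what a single rule in this format might look like. The trigger, instruction, and example answers below are invented for illustration and are not taken from Waymo's actual prompt.

```python
# Hypothetical sketch of the trigger-instruction-response rule format
# Wong describes. All wording here is invented; it is not Waymo's prompt.

RULE_TEMPLATE = """\
TRIGGER: {trigger}
INSTRUCTION: {instruction}
WRONG ANSWER: {wrong}
CORRECT ANSWER: {correct}
"""

rule = RULE_TEMPLATE.format(
    trigger="The rider asks the assistant to change the route.",
    instruction=(
        "State that the assistant cannot alter the route and point the "
        "rider to the trip controls in the Waymo app."
    ),
    wrong="Sure, rerouting now.",
    correct=(
        "I can't change the route myself, but you can edit your "
        "destination in the Waymo app."
    ),
)
print(rule)
```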
Australia's anti-money-laundering regulator, Austrac, is pushing back against banks that rely too heavily on AI to generate suspicious activity reports (SARs). According to industry sources, Austrac officials have met with several banks recently to urge more careful use of AI. One major bank was reportedly reprimanded in a private meeting.
Banks have used machine learning to flag suspicious transactions for years. But the shift toward modern large language models only picked up over the past two years, as banks saw the technology as a way to cut costs.
Austrac deputy chief executive Katie Miller said the agency doesn't want a flood of "low-quality" computer-generated reports packed with data but lacking real intelligence value. She warned that banks might be submitting large volumes of reports simply to avoid penalties.
"The banks are leaning towards the ends of higher quality but smaller amounts," Miller said. "The more data you've got, there's a problem of noise. If banks were looking to use artificial intelligence just to increase the volume (of reports), that's something we need to assess."