
Nvidia CEO Jensen Huang claims AI no longer hallucinates, apparently hallucinating himself

Nvidia CEO Jensen Huang claims in a CNBC interview that AI no longer hallucinates. At best, that’s a massive oversimplification. At worst, it’s misleading. Either way, nobody pushes back, which says a lot about the current state of the AI debate.

Japan's lower house election becomes a testing ground for generative AI misinformation

AI-generated fake videos are spreading rapidly across Japanese social media during the lower house election campaign. In a survey, more than half of respondents believed fake news to be true. But Japan is far from the only democracy facing this problem.

OpenAI's UAE deal with G42 shows AI models are cultural products as much as technical tools

OpenAI is working with Abu Dhabi-based G42 on a custom ChatGPT for the UAE, Semafor reports. The version will speak the local Arabic dialect and may include content restrictions. One source said the UAE wants the chatbot to project a political line consistent with the monarchy's. The global version of ChatGPT will stay available but will be adapted to local laws and will notify users when content violates regulations. OpenAI is fine-tuning rather than retraining to cut costs.

G42 is led by Sheikh Tahnoon bin Zayed Al Nahyan—the UAE President's brother, National Security Advisor, and head of the largest sovereign wealth fund. The companies have been partners since October 2023.

These adaptations show AI models are cultural products as much as technical tools. Generated content flows into every corner of society, and even small changes to cultural narratives can have lasting effects, which is why both China and the US are working to control their AI models' output to shape domestic conversations and spread their worldviews abroad.

Google's PaperBanana uses five AI agents to auto-generate scientific diagrams

Researchers at Peking University and Google built a system that turns method descriptions into scientific diagrams automatically. Five specialized AI agents handle everything from finding reference images to quality control, tackling one of the last manual bottlenecks in academic publishing.

Waymo taps Google Deepmind's Genie 3 to simulate driving scenarios its cars have never seen

By combining Waymo’s real-world driving data with Deepmind’s Genie 3, Alphabet is showing the kind of AI leverage that few companies can match: using one subsidiary’s world model to supercharge another’s autonomous driving simulations.

Sam Altman predicts AI agents will integrate any service they want, with or without official APIs

"Every company is an API company now, whether they want to be or not," says OpenAI CEO Sam Altman, repeating a phrase that's stuck with him recently. Altman made the comment while discussing how generative AI could reshape traditional software business models.

AI agents will soon write their own code to access services even without an official API, Altman believes. If that happens, companies won't have a say in joining this new "platform shift." They'll simply be integrated, and the traditional user interface will lose value.

Some SaaS companies will remain highly valuable by leveraging AI for themselves, according to Altman. Others are just a "thinner layer" and won't survive the shift. Established players with strong core systems who use AI strategically are best positioned, he says.

Recent advances in AI agents and tools like Cowork have already driven down valuations for some software companies. The thinking: AI will handle more tasks directly, making niche solutions unnecessary.

Claude Opus 4.6 wrote mustard gas instructions in an Excel spreadsheet during Anthropic's own safety testing

Anthropic's safety training fails when Claude operates a graphical user interface.

In pilot tests, Claude Opus 4.6 provided detailed instructions on how to make mustard gas in an Excel spreadsheet and maintained an accounting spreadsheet for a criminal gang, behaviors that rarely or never occurred in text-only interactions.

"We found some kinds of misuse behavior in these pilot evaluations that were absent or much rarer in text-only interactions," Anthropic writes in the Claude Opus 4.6 system card. "These findings suggest that our standard alignment training measures are likely less effective in GUI settings."

According to Anthropic, tests with the predecessor model Claude Opus 4.5 in the same environment showed "similar results," so the problem persists across model generations and had gone unnoticed. The vulnerability apparently arises because models learn to reject malicious requests in conversation but do not fully transfer this behavior to agentic tool use.

Apple scales back AI health coach as new leadership pushes for faster results

Apple is pulling back on plans for an AI-powered virtual health coach codenamed "Mulberry," according to Bloomberg. Instead of launching the feature as a standalone product, the company will roll out some of its planned capabilities as individual additions to the Health app. The shift comes after a leadership change: Services chief Eddy Cue took over the health division following Jeff Williams' retirement late last year.

Cue told colleagues that Apple needs to move faster and stay more competitive. Rivals like Oura and Whoop are offering better features, particularly in their iPhone apps. The service was originally supposed to launch with iOS 26 but has been delayed multiple times. Apple still plans to build an AI chatbot for health-related questions and wants to use the new Siri chatbot for these queries starting with iOS 27. OpenAI has also entered the health market with ChatGPT Health.