Short

A developer at OpenAI known as "Roon" on X explains why large language models never behave exactly the same way twice. Roon says a model's "personality" can shift with every training run, even if the dataset doesn't change. That's because training involves random elements, for example in reinforcement learning, so each run takes a different path through what's called "model space." As a result, every training run produces slightly different behavior. Roon adds that even within a single training run, it's nearly impossible to recreate the same personality.
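The effect of randomness on a training run can be illustrated with a toy example. The sketch below is not how OpenAI trains its models and does not involve reinforcement learning; it is a minimal stand-in, assuming a two-weight linear model fitted with stochastic gradient descent, and the names `train` and `predict` are hypothetical. The dataset is fixed, and only the random seed (which controls the initial weights and the sampling order) changes between runs.

```python
import random


def train(seed, steps=200, lr=0.1):
    """Fit y = w1*x1 + w2*x2 with SGD; only the seed differs between runs."""
    rng = random.Random(seed)
    # Fixed dataset. Both inputs are always equal, so many different
    # weight pairs (any w1 + w2 = 2) fit the data equally well.
    data = [((x, x), 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
    # Random initialization: one source of run-to-run variation.
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    for _ in range(steps):
        # Random sampling order: a second source of variation.
        (x1, x2), y = rng.choice(data)
        err = w[0] * x1 + w[1] * x2 - y
        w[0] -= lr * err * x1
        w[1] -= lr * err * x2
    return w


def predict(w, x1, x2):
    """Evaluate the fitted model on an arbitrary input."""
    return w[0] * x1 + w[1] * x2


run_a = train(seed=0)
run_b = train(seed=1)
# Both runs fit the training data, yet they disagree on an input
# unlike anything in the dataset (x1 != x2).
print(predict(run_a, 1.0, 1.0), predict(run_b, 1.0, 1.0))
print(predict(run_a, 1.0, 0.0), predict(run_b, 1.0, 0.0))
```

Both runs end up at different points in weight space that explain the training data equally well, which is a very rough analogue of the "different choices in model space" Roon describes.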

OpenAI tries to keep these "personality drifts" in check, since users often get attached to a model's unique quirks. This was especially true of the earlier, sycophantic version of GPT-4o, which some users still miss. Roon, however, wasn't a fan. He even publicly wished for that "insufficiently aligned" model's "death" before deleting the post.

Short

Anthropic is ramping up its European operations, opening new offices in Paris and Munich to build on its presence in the region. These locations join existing hubs in London, Dublin, and Zurich, and will act as regional centers for sales, partnerships, and policy engagement.

The company says the EMEA market is its fastest-growing segment, with sales jumping more than ninefold over the past year. To support this growth, Anthropic is putting together a dedicated leadership team for Europe, naming Pip White to lead Northern Europe and Thomas Remy to lead Southern Europe.

Anthropic now operates twelve offices worldwide. In Europe, major customers like BMW, SAP, Sanofi, and Doctolib are already using the Claude AI model for software development, network management, and other tasks. Anthropic is also working with organizations such as TUM.ai and Unaite.

Short

Google has added a File Search Tool to the Gemini API, allowing developers to query their own documents using a vector database. The tool manages storing files, splitting them up, searching for relevant content, and inserting that information into Gemini's responses. Supported file types include PDF, DOCX, TXT, and JSON. The tool is free to use, except for a small fee when indexing data ($0.15 per million tokens). Source references are included automatically in the responses.
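The four steps described above (storing, splitting, searching, inserting) and the indexing fee can be sketched in plain Python. This is not the Gemini API and makes no claim about how Google implements the tool; it is a toy illustration that assumes a bag-of-words "embedding" and an in-memory list instead of a real vector database. All function names are hypothetical; only the $0.15 per million tokens figure comes from the article.

```python
import math
import re
from collections import Counter

# Indexing price in USD per million tokens, as quoted in the article.
INDEX_PRICE_PER_MILLION_TOKENS = 0.15


def indexing_cost(tokens):
    """Estimated one-time indexing fee in USD for a given token count."""
    return tokens / 1_000_000 * INDEX_PRICE_PER_MILLION_TOKENS


def chunk(text, size=12):
    """Split a document into pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_index(files):
    """Store every chunk together with its vector and its source file."""
    index = []
    for name, text in files.items():
        for i, piece in enumerate(chunk(text)):
            index.append(
                {"source": name, "chunk": i, "text": piece, "vec": embed(piece)}
            )
    return index


def search(index, query, k=1):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda e: cosine(q, e["vec"]), reverse=True)[:k]


def build_prompt(index, query, k=1):
    """Insert the retrieved chunks, tagged with their source, into a prompt."""
    hits = search(index, query, k)
    context = "\n".join(f"[{h['source']}#{h['chunk']}] {h['text']}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"


files = {
    "hr.txt": "Employees receive thirty days of paid vacation per year.",
    "it.txt": "Reset your password through the internal self service portal.",
}
index = build_index(files)
print(build_prompt(index, "How do I reset my password?"))
print(f"Indexing 2,000,000 tokens: ${indexing_cost(2_000_000):.2f}")
```

The source tags in the prompt mirror the idea of automatic source references: because each chunk keeps the name of the file it came from, an answer can point back to the document it relied on.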

Google says the main use cases are internal search systems and chatbots that need to answer questions based on specific company documents. Developers can find more information in the documentation or test out the demo in Google AI Studio.

Short

Nvidia turns to synthetic data to tackle robotics’ biggest challenge: the lack of training data.

"We call this the big data gap in robotics," an Nvidia researcher said at the Physical AI and Robotics Day during GTC Washington. While large language models train on trillions of internet tokens, robot models like Nvidia’s GR00T have access to only a few million hours of teleoperation data, gathered through complex manual effort, and most of it is narrowly task-specific.

Nvidia’s answer is to rethink what it calls the "data pyramid for robotics." At the top sits real-world data, which is scarce and expensive to collect. In the middle lies synthetic data from simulation, which is theoretically limitless. At the base is unstructured web data. "When synthetic data surpasses the web-scale data, that's when robots can truly learn to become generalized for every task," the team states. With Cosmos and Isaac Sim, Nvidia aims to turn robotics’ data shortage into a compute challenge instead.