1X Technologies uses world models to optimize robot training

Sep 19, 2024


Key Points

Norwegian startup 1X Technologies says it has made significant progress in developing AI-based world models for robots that serve as virtual simulators to test and improve the robots' abilities in a variety of scenarios.
The world models were trained on thousands of hours of video that 1X recorded of its humanoid EVE robots performing various tasks in homes and offices. According to the company, this lets the models plausibly predict how objects and the environment change in response to the robot's actions.
Despite some shortcomings, such as trouble keeping object colors and shapes consistent or modeling physical laws correctly, 1X sees these world models as a milestone in the development and training of general-purpose robots. The company is providing datasets, pre-trained models, and prize money as part of a challenge.

Norwegian startup 1X Technologies claims to have made major progress in developing AI-based world models for robots.

According to 1X, these models act as virtual simulators, allowing robot capabilities to be tested and improved across various scenarios without real-world trials.

1X sees this as a potential solution to the "robotics problem" - the challenge of reliably evaluating robots trained for diverse tasks in ever-changing environments.

Example of a model that can fold T-shirts. Performance decreases over a 50-day period. | Image: 1X Technologies

1X says even identical robot models can show large performance fluctuations within days as environments change, making rigorous real-world evaluation frustratingly difficult.
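The appeal of evaluating in simulation can be illustrated with a small sketch. Everything in it is hypothetical and not taken from 1X: `simulated_episode` is a stand-in for rolling out a policy inside a learned world model, and the `skill` number is a made-up proxy for the quality of a policy checkpoint. The sketch only shows why fixed starting conditions make a comparison repeatable, which is what 1X says is hard to achieve in the real world.

```python
import random


def simulated_episode(skill: float, seed: int) -> bool:
    """Hypothetical stand-in for one world-model rollout.

    The seed fixes the simulated starting conditions, so the same
    (skill, seed) pair always produces the same outcome.
    """
    rng = random.Random(seed)
    return rng.random() < skill


def success_rate(skill: float, seeds) -> float:
    """Fraction of simulated episodes that succeed."""
    outcomes = [simulated_episode(skill, seed) for seed in seeds]
    return sum(outcomes) / len(outcomes)


# Both checkpoints face exactly the same 200 simulated scenarios.
seeds = range(200)
rate_a = success_rate(0.6, seeds)  # assumed weaker checkpoint
rate_b = success_rate(0.8, seeds)  # assumed stronger checkpoint

print(f"checkpoint A: {rate_a:.2f}, checkpoint B: {rate_b:.2f}")
```

Because both checkpoints are scored on identical scenarios, a difference between the two rates reflects the checkpoints rather than a change in the environment between test days.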

To train its world models, 1X collected thousands of hours of video footage showing its humanoid EVE robots performing various tasks in homes and offices.

Using machine learning on this data, the models can now plausibly predict how objects and environments will respond to robot actions.

Even for actions not explicitly programmed, the model generates believable video output. For instance, it learns that people and objects should be avoided.

Video: 1X Technologies

Robots can fold T-shirts - most of the time

According to 1X, the models can already handle complex physical interactions such as gripping and lifting objects, opening doors and drawers, and handling deformable materials such as clothing, for example when folding T-shirts.

"The main value of the world model comes from simulating object interactions. In the following generations, we provide the model the same initial frames and three different sets of actions to grasp boxes. In each scenario, the box(es) grasped are lifted and moved in accordance with the motion of the gripper, while the other boxes remain undisturbed."

1X Technologies
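The comparison described in the passage above can be sketched in a few lines. This is a toy illustration, not 1X's model: the hand-written `step` function stands in for a learned video model, and the names `State`, `step`, and `rollout` as well as the action format are assumptions made for this example. It only shows the pattern: one shared initial state, several action sequences, and a check of which boxes moved.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class State:
    boxes: Tuple[int, ...]      # x position of each box
    gripper: int                # x position of the gripper
    held: Optional[int] = None  # index of the box being held, if any


def step(state: State, action: Tuple[str, Optional[int]]) -> State:
    """Hand-written stand-in for a learned world model (assumption)."""
    kind, target = action
    if kind == "move":
        boxes = list(state.boxes)
        if state.held is not None:
            boxes[state.held] = target  # a held box follows the gripper
        return replace(state, boxes=tuple(boxes), gripper=target)
    if kind == "grasp":
        for index, x in enumerate(state.boxes):
            if x == state.gripper:
                return replace(state, held=index)
        return state  # nothing to grasp at this position
    if kind == "release":
        return replace(state, held=None)
    raise ValueError(f"unknown action: {kind}")


def rollout(state: State, actions) -> State:
    """Apply an action sequence to the state and return the final state."""
    for action in actions:
        state = step(state, action)
    return state


# Same initial state, three different action sequences.
initial = State(boxes=(0, 2, 4), gripper=1)
plans = {
    "left": [("move", 0), ("grasp", None), ("move", 7), ("release", None)],
    "middle": [("move", 2), ("grasp", None), ("move", 8), ("release", None)],
    "right": [("move", 4), ("grasp", None), ("move", 9), ("release", None)],
}
results = {name: rollout(initial, plan) for name, plan in plans.items()}

# For each plan, list the indices of the boxes whose position changed.
moved = {
    name: [i for i, (a, b) in enumerate(zip(initial.boxes, final.boxes)) if a != b]
    for name, final in results.items()
}
print(moved)
```

In this toy version the rule that ungrasped boxes stay put is written by hand. In a learned world model, that behavior would have to emerge from the training video, which is why 1X highlights it.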

1X admits some limitations. The models sometimes struggle to maintain consistent object colors and shapes, or to accurately model physics. Self-recognition in mirrors also remains unreliable.

Video: 1X Technologies

Still, 1X views these world models as a milestone for developing and training general-purpose robots. To accelerate progress, the startup is offering datasets, pre-trained models and prize money through its "1X World Model Challenge."

World models promise greater efficiency in training

1X's long-term vision is to use world models directly for robot training, potentially enabling huge efficiency gains compared to real-world testing. The company is actively recruiting AI experts to pursue these goals.

Earlier this year, 1X raised $100 million to advance its humanoid household robot Neo toward market launch. The funding, backed by industry leaders like OpenAI, underscores high expectations for 1X's technology.

Besides 1X, Nvidia is also investing heavily in humanoid robot development. The company recently unveiled a training approach using Apple Vision Pro. Nvidia researcher Jim Fan, who works on foundation models for robots, expects a "GPT-3 moment" for robotics in the coming years.


Source: 1X Technologies