On Twitter, Tesla's AI team is sharing its plans for foundation models for autonomous robots like the Tesla Bot.

Tesla's goal with the Tesla Bot is to create a universal, autonomous, bipedal humanoid robot capable of performing dangerous, repetitive, or boring tasks. Like other robotics projects, Tesla hopes to achieve this goal by using foundation models for autonomous robots.

Such models are trained on large amounts of data, and their general capabilities form the basis for specialized applications. In natural language processing, GPT-4 is one example of such a model.

Tesla relies on big (video) data

For its robotics models, Tesla plans to build on the multimodal neural networks already used in its self-driving vehicles. These networks currently process multiple modalities, such as camera video, maps, navigation, IMU (inertial measurement unit) data, and GPS, to predict whether vehicles, cyclists, people, or other objects are in the way.

Video: Tesla

According to Tesla's AI team, these networks could also be used for collision avoidance in any robot. Data from the entire fleet is also used to reconstruct sections of road, on which the AI can then be trained further. In addition, the team is developing generative models that are trained on diverse real-world data and can, for example, produce short new video clips in which the vehicle behaves differently.

Video: Tesla

This increases the amount of available data, a basic requirement for foundation models. A short clip also shows how a Tesla Bot or similar system collects data in offices.

Video: Tesla

Video foundation models as the "brains" of the Optimus bot

Together, these building blocks are meant to produce video foundation models that form the "brains" of cars and robots. Google is also experimenting with such foundation models for robots and has shown with its multimodal Robotic Transformer that they can be used to build better robots.

Tesla has a clear data advantage, at least in autonomous driving. With the Optimus robot, which is planned for mass production, it could also collect the data needed for robot foundation models.

To do this, Tesla needs computing power: it wants to scale its own Dojo supercomputer to 100 exaflops by October 2024, the equivalent of about 400,000 Nvidia A100 GPUs. The interesting insight into the team's plans, however, is mainly an attempt to recruit the experts Tesla is desperately looking for.
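
As a rough sanity check, the Dojo comparison can be turned into an implied per-GPU figure. The sketch below only divides the two numbers quoted in the article. The constant and function names are illustrative assumptions, and the source does not say which numeric precision the exaflops figure refers to, so the result should not be read as an official A100 specification.

```python
# Figures quoted in the article (not independently verified here).
DOJO_TARGET_FLOPS = 100e18        # 100 exaflops, Tesla's stated target
A100_EQUIVALENT_COUNT = 400_000   # "about 400,000 Nvidia A100 GPUs"


def implied_flops_per_gpu(total_flops: float, gpu_count: int) -> float:
    """Throughput each GPU would need for the quoted equivalence to hold."""
    return total_flops / gpu_count


per_gpu = implied_flops_per_gpu(DOJO_TARGET_FLOPS, A100_EQUIVALENT_COUNT)
print(f"{per_gpu / 1e12:.0f} teraflops per GPU")  # prints "250 teraflops per GPU"
```

Under these assumptions, the comparison implies roughly 250 teraflops per GPU, which suggests the exaflops target refers to a reduced-precision format rather than full double-precision arithmetic.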

Summary
  • Tesla plans to develop video foundation models for autonomous robots, such as the Tesla Bot.
  • Tesla plans to develop these models based on the multimodal neural networks already used in its autonomous driving vehicles to process data such as camera video, maps, navigation and more.
  • To provide the computing power for these ambitious plans, Tesla plans to bring its Dojo supercomputer up to 100 exaflops by October 2024.
Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.