Video AI startup RunwayML introduces two new features for its video generator. The company is also aiming higher with its long-term world model research project.

With "Text-to-Speech", Runway adds synthetic voices to its video editor. Users can choose from a range of voices categorized by characteristics such as young, mature, female, or male. The feature is available on all plans.

Video: Runway via X

Another new feature is the Ratio function, which converts a generated video to a different aspect ratio, such as 1:1 or 16:9, with a single click. This makes it easier to produce videos for different channels.

Video: Runway via X

General world models for better videos - and beyond

Runway also announced a new research initiative: The company wants to develop what it calls "world models." World models are intended to advance AI through systems that can understand and simulate the visual world.

A world model is an AI system that develops an internal representation of an environment to simulate future events in that environment. The goal of a general world model is to map and simulate real-world situations and interactions.

One example of such a model is Wayve's GAIA-1, which was developed from visual and textual data to control autonomous vehicles based on an understanding of their environment. However, driving is a limited and controlled scenario.

A video model like Gen-2 can be considered a "very early and limited" world model because it has developed a basic understanding of physics and motion for video generation, Runway writes. However, according to the company, it is still limited in its capabilities and has problems with complex camera or object motion.

Runway is currently working on several research challenges, including developing models that can produce consistent maps of the environment and realistic models of human behavior.

Video: Runway

Meta's head of AI research, Yann LeCun, agrees that AI first needs a world model and a basic understanding of the world to make significant progress. In his view, language alone, as used in today's large language models, is not a sufficient knowledge base for achieving human-like AI.

The Runway research project is based on multimodal training, that is, on text, audio, image, video, and other data. It is moving in a similar direction, as multimodality becomes the new norm in AI model development.

Summary
  • AI startup RunwayML introduces two new features for its video generator: "Text-to-Speech" for synthetic voices and "Ratio" for easy conversion of video formats.
  • RunwayML also announces a research initiative to develop "world models" to drive AI systems that can understand and simulate the visual world.
  • The company is addressing research challenges such as developing models that can create consistent environmental maps and realistic models of human behavior.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.