Update
  • X user guizang.ai says he has gained access to the model through a smartphone app that requires a Chinese phone number.
  • On X, he shows a series of prompts and the resulting videos, which are of good quality but no longer than five seconds.
  • The user says he did not cherry-pick the results: every video shows the first output for its prompt.
  • Another user, Junie, says that it doesn't take more than 3 minutes to generate a video, and you can generate 5 clips at the same time.

Chinese tech company Kuaishou has unveiled KLING, a new video generation model. Based on the demos, it could rival OpenAI's Sora.

Kuaishou says KLING can make videos up to two minutes long at 1080p resolution and 30 frames per second. According to the company, it can also model complex motion sequences in a physically accurate way.
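
To put those numbers in perspective, here is a rough back-of-the-envelope calculation of how many frames and pixels a maximum-length clip would contain. It assumes that "1080p" means a 16:9 frame of 1920 x 1080 pixels, which Kuaishou does not spell out, and the function name `clip_stats` is made up for this illustration.

```python
def clip_stats(duration_s: int, fps: int, width: int, height: int) -> dict:
    """Return frame and pixel counts for a clip of the given length and size."""
    frames = duration_s * fps            # total number of frames in the clip
    pixels_per_frame = width * height    # pixels in a single frame
    return {
        "frames": frames,
        "pixels_per_frame": pixels_per_frame,
        "total_pixels": frames * pixels_per_frame,
    }

# Two minutes at 30 fps; 1920 x 1080 is an assumption about what "1080p" means here.
kling_max = clip_stats(duration_s=120, fps=30, width=1920, height=1080)
print(kling_max["frames"])        # 3600
print(kling_max["total_pixels"])  # 7464960000
```

Under these assumptions, a single two-minute clip works out to 3,600 frames, each of which has to stay consistent with the others.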

One video shows a two-minute train ride made with the prompt "Train ride with different landscapes seen through the window." For comparison, OpenAI announced its video model Sora in mid-February, showing relatively consistent videos up to one minute long.

Video: kling.kuaishou.com

Another example of a longer video shows a boy riding a bike in a garden as the seasons change. Of course, the landscapes change with the seasons, and maybe that's the trick to getting the length, but the boy on the bike looks pretty consistent. It would be more impressive if he rode around the same garden in circles, though.

Video: kling.kuaishou.com

A video of a boy eating a cheeseburger at a fast food restaurant is also noteworthy. The burger gets smaller after he takes the first bite.

Video: kling.kuaishou.com

A knife cutting an onion and a man eating pasta from a plate are similar examples of a physical interaction between two objects that causes a change in the video. However, these examples only last a few seconds, so it's not clear how consistent this "physical simulation" is.

Video: kling.kuaishou.com

The developers say KLING uses a 3D space-time attention system to better model motion and physical interaction, and the model is able to generate long, high-resolution videos thanks to a scalable framework and optimized inference.
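
Kuaishou has not published architectural details, so the following is only a generic illustration of why attention across space and time is computationally demanding, not a description of KLING itself. It assumes a video is split into a grid of tokens (frames x height x width) and compares the number of token pairs in joint space-time attention with a factorized variant that attends within each frame and then across time. The function names and the grid size are made up for this sketch.

```python
def joint_attention_pairs(frames: int, height: int, width: int) -> int:
    """Every token attends to every other token across space and time."""
    tokens = frames * height * width
    return tokens * tokens

def factorized_attention_pairs(frames: int, height: int, width: int) -> int:
    """Spatial attention within each frame, then temporal attention per position."""
    spatial = frames * (height * width) ** 2   # each frame attends within itself
    temporal = (height * width) * frames ** 2  # each position attends across frames
    return spatial + temporal

# Hypothetical toy grid: 8 frames of 4 x 4 tokens each.
joint = joint_attention_pairs(8, 4, 4)
factorized = factorized_attention_pairs(8, 4, 4)
print(joint, factorized)  # 16384 3072
```

Joint attention grows with the square of the total token count, which is one reason long, high-resolution videos depend on the kind of scalable framework and optimized inference the developers mention.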

Kuaishou claims the model correctly simulates the physical properties of the real world. Using a diffusion transformer, it can also combine concepts and create fictional scenes, such as a cat driving a car through a busy city.

Video: kling.kuaishou.com

OpenAI's Sora also uses a diffusion transformer, and OpenAI describes its video generator as a "world simulator," though AI experts like Meta's Yann LeCun have criticized the startup for making such a bold claim.

KLING is currently available as a public demo in China. Kuaishou is a Beijing-based tech company best known in China for its social media apps. With KLING, it's now entering the race for large-scale generative AI models.

Tech investor and actor Ashton Kutcher has access to a beta version of Sora. He believes that generative AI for video will transform the film market and Hollywood.

Summary
  • Kuaishou, a Chinese technology company, has introduced KLING, an AI model for video generation that can produce videos of up to two minutes in length at 1080p resolution and 30 frames per second.
  • According to the company, KLING is able to model complex motion sequences in a physically correct way using a 3D space-time attention system. A "diffusion transformer" allows the combination of concepts and the generation of fictitious scenes that were not part of the training dataset.
  • The model is currently available as a public demo in China and could compete with Sora from OpenAI, which describes its own video generator as a "world simulator."
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.