
DreaMoving: Alibaba shows off what is essentially a TikTok generator

Maximilian Schreiner


Alibaba introduces a new TikTok generator: DreaMoving can create personalized dance videos through image or text prompts.

The system is based on diffusion models and adds two components: a Video ControlNet and a Content Guider. The Video ControlNet steers the motion so that the generated video follows the specified animation. The Content Guider controls the content of the generated videos, including the appearance of the people and the backgrounds.

Image: Alibaba

DreaMoving also integrates motion blocks into both the Denoising U-Net and ControlNet to improve temporal consistency and motion fidelity. Users can use text or image prompts to control the desired look and feel of the video.

DreaMoving learns from 1,000 dance videos

The DreaMoving system was trained on more than 1,000 dance videos, which were split into short clips of 8 to 10 seconds so that each clip shows continuous footage without transitions or special effects. The team used MiniGPT-v2 to caption the individual frames of the clips, providing the text descriptions needed for multimodal training.
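To make the clip-splitting step concrete, here is a minimal sketch of how a video's duration could be cut into segments of 8 to 10 seconds. It is an illustration only, not Alibaba's pipeline: the function name `split_into_clips` and its parameters are hypothetical, and the sketch works on timestamps alone, so it does not detect transitions or special effects the way the described preprocessing would have to.

```python
from typing import List, Tuple


def split_into_clips(
    duration_s: float,
    min_len: float = 8.0,   # assumed lower bound, taken from the "8 to 10 seconds" in the text
    max_len: float = 10.0,  # assumed upper bound
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps in seconds for consecutive clips.

    Clips are cut greedily at max_len. A trailing remainder is kept only
    if it is at least min_len long; otherwise it is discarded.
    """
    clips: List[Tuple[float, float]] = []
    start = 0.0
    # Keep cutting while enough footage remains for a clip of at least min_len.
    while duration_s - start >= min_len:
        end = min(start + max_len, duration_s)
        clips.append((start, end))
        start = end
    return clips


if __name__ == "__main__":
    # A 28-second video yields two 10-second clips and one 8-second clip.
    print(split_into_clips(28.0))
```

In a real pipeline, the resulting timestamps would be passed to a video tool to cut the files, and clips containing scene changes would be filtered out separately.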

Video: Alibaba

Thanks to the training and the customized architecture, DreaMoving is able to generate realistic videos from text input, images or a combination of both. For example, the system can generate videos of a specific person wearing a specific piece of clothing provided by the user via an image.

Video: Alibaba

In addition to DreaMoving, Alibaba recently introduced a similar system called Animate Anyone, which can animate people in videos beyond just dancing. A comparable system, MagicAnimate, is available from TikTok's parent company ByteDance.

Further examples and information can be found on the DreaMoving project page. There is also a demo on Hugging Face where you can upload faces and animations or choose from a preselection.
