
MiniMax has introduced Hailuo 02, the second generation of its video AI model, with major upgrades in both performance and price.


The new model uses an architecture called Noise-aware Compute Redistribution (NCR), which MiniMax says improves training and inference efficiency by a factor of 2.5. The NCR architecture handles long video sequences differently depending on the stage of training. Early in training, when artificial noise is heavily introduced into the data, videos are compressed as much as possible. Later, when the training videos are clearer, the model processes them at full resolution.

MiniMax diagram: joint training of downsampling and re-noising for early compression of ultra-long video tokens.
MiniMax spotlights its new NCR architecture as key to Hailuo 02, but has yet to share technical details. | Image: MiniMax
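Since MiniMax has not published how NCR works, the following is only a toy sketch of the general idea described above: compress heavily when the input is mostly noise, and process at full resolution when it is clean. The function names, the linear schedule, the maximum factor of 4, and the patch size are all hypothetical assumptions for illustration, not MiniMax's method.

```python
# Toy illustration only: MiniMax has not disclosed NCR's internals.
# Every name, threshold, and factor below is a hypothetical assumption.

def downsample_factor(noise_level: float, max_factor: int = 4) -> int:
    """Pick a spatial downsample factor from a noise level in [0, 1].

    Heavier noise gives stronger compression; clean input gives factor 1
    (full resolution).
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    # Linear interpolation between 1 (clean) and max_factor (pure noise).
    return max(1, round(1 + noise_level * (max_factor - 1)))


def token_count(frames: int, height: int, width: int, patch: int, factor: int) -> int:
    """Number of patch tokens after shrinking height and width by `factor`."""
    return frames * (height // (patch * factor)) * (width // (patch * factor))


if __name__ == "__main__":
    # Hypothetical clip: 64 frames at 768x1024, split into 16x16 patches.
    for noise in (1.0, 0.5, 0.0):
        factor = downsample_factor(noise)
        tokens = token_count(64, 768, 1024, 16, factor)
        print(f"noise={noise:.1f} -> factor={factor}, tokens={tokens}")
```

Under these assumptions, the token count of a sequence shrinks with the square of the downsample factor, which is one plausible way compute could be shifted away from noisy inputs.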

Compared to the previous version, Hailuo 02 has three times more parameters and four times more training data, with MiniMax also noting improvements in data quality and diversity. The company hasn't disclosed exact parameter counts or dataset sizes.

According to MiniMax, Hailuo 02 shows clear gains in handling complex prompts and simulating physical processes. The company claims it's currently the only model able to accurately generate intricate scenes like gymnastics routines.


Video: MiniMax

Hailuo 02 is available in three variants: 768p for six seconds, 768p for ten seconds, and 1080p for six seconds. The previous model was limited to 720p, six-second videos at 25 fps.

In the Artificial Analysis Video Arena benchmark, where users rate videos from competing AI models, Hailuo 02 finished second in the image-to-video category. It placed just behind ByteDance's Seedance and ahead of Google's much-hyped Veo 3. However, the Veo 3 version in this ranking does not support audio, which is a major part of its appeal.

Table of leading image-to-video AI models with Elo scores: Seedance 1.0 leads with 1351 points; 95% confidence intervals are shown.
In user benchmarks, Hailuo 02 outperforms Google Veo 3, even though Veo also supports native audio generation. | Screenshot: THE DECODER

Since its demo launch in August last year, people have created over 3.7 billion videos using the Hailuo platform, according to MiniMax. The company describes its initial rollout as very random, but says it quickly attracted widespread attention from creators worldwide.

The model can be accessed via web interface, mobile app, or API. For API users, generating a six-second 768p video costs $0.28, while a 1080p version costs $0.49. By comparison, producing an eight-second 1080p video with Google Veo 3 can cost around $3, depending on the plan.
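Because the quoted clips differ in length, a per-second comparison is easier to read. The short sketch below uses only the prices given above; the Veo 3 figure is the rough "around $3" estimate and varies by plan, and the dictionary labels are just illustrative names.

```python
# Per-second API cost, using only the prices quoted in the article.
# The Veo 3 entry is an approximate figure ("around $3"), not a list price.
PRICES = {
    "Hailuo 02, 768p, 6 s": (0.28, 6),
    "Hailuo 02, 1080p, 6 s": (0.49, 6),
    "Veo 3, 1080p, 8 s (approx.)": (3.00, 8),
}


def cost_per_second(total_usd: float, seconds: int) -> float:
    """Divide the clip price by its length in seconds."""
    return total_usd / seconds


per_second = {
    name: cost_per_second(usd, secs) for name, (usd, secs) in PRICES.items()
}

for name, rate in per_second.items():
    print(f"{name}: ${rate:.3f} per second")

# How much more expensive Veo 3 is per second at 1080p, under these figures.
ratio = per_second["Veo 3, 1080p, 8 s (approx.)"] / per_second["Hailuo 02, 1080p, 6 s"]
print(f"Veo 3 is roughly {ratio:.1f}x the per-second price of Hailuo 02 at 1080p")
```

On these numbers, Hailuo 02 works out to about $0.047 per second at 768p and $0.082 per second at 1080p, against roughly $0.375 per second for Veo 3, a gap of about 4.6 times at 1080p.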


MiniMax says it is working to improve generation speed and stability, and to add new features beyond the current text-to-video and image-to-video options. Competing platforms like Runway already offer more advanced capabilities, such as tracking shots.

The Hailuo 02 release is part of "MiniMax Week," a five-day event during which the Chinese startup also unveiled an open-source language model, MiniMax-M1, complete with parameter counts and a technical paper. In contrast, technical details about Hailuo 02's training architecture remain undisclosed.

Summary
  • MiniMax has introduced its Hailuo 02 video AI model, which uses a new architecture that the company says increases training and inference efficiency by a factor of 2.5 and handles complex prompts and physical processes better than its predecessor.
  • The model is available in three variants, at up to 1080p resolution or up to ten seconds in length; according to MiniMax, users have generated more than 3.7 billion videos on the Hailuo platform since its demo launch. In user ratings on the Artificial Analysis Video Arena, Hailuo 02 ranks ahead of Google Veo 3 in the image-to-video category.
  • Hailuo 02 is available via web, app, and API and costs $0.49 for a six-second 1080p video, significantly cheaper than some competing offerings.
Jonathan writes for THE DECODER about how AI tools can make our work and creative lives better.