After text and images, AI is slowly conquering moving images. An ambitious AI director shares his first results and workflow on Reddit.
The YouTube channel "Machine_Mythos" is experimenting with text-to-video models and other AI tools to create short films with AI. For the short video "The Day Hell Froze Over," the AI director used a combination of AI-generated animated images and the Runway Gen-2 text-to-video model.
The following tutorial from Machine Mythos is for Runway Gen-2. The video was still subject to the 4-second limit for Runway videos, which has since been raised to 18 seconds, so creating coherent movies with longer scenes should now be much easier than in the example above. Here is Machine Mythos' basic workflow for "The Day Hell Froze Over."
I choose music first, usually, it helps me with the flow of the cuts. Also, make sure to carry over motion from previous shots. Well, this stuff is just basic direction, but it should be your first step before doing anything.
Then I generate the image in Midjourney or Stable Diffusion and keep improving it until it's near perfect. Making the starting image accurate saves a lot of money and time. Use inpainting, outpainting, etc. All images should have the same anchor points for consistent looks, variations are a good way to get different angles of the same subject. This is necessary due to the 4-second limit.
I mostly use image prompts+text prompts because image prompts alone do not give enough motion and cannot be controlled. Keep spamming the preview button, it's free.
Look for any signs of motion, do not generate stuff where motion is less likely to happen. If you run into static stuff or stuff that looks terrible, just switch to a new strategy, no point hammering it.
From the resulting video, you can create screenshots and alter them in Photoshop to feed them back in again for consistency.
Apply interpolation, slow motion, reverse the footage, etc., there is very rarely a non-salvageable shot. You can alter the story to fit what is shown.
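The post-processing steps in the workflow above (pulling stills from a clip, reversing footage, slowing it down with interpolation) can be sketched with ffmpeg. This is only an illustration and not the tool chain Machine Mythos describes: the use of ffmpeg, the file names, and the helper function names are all assumptions. The functions only build command lines; actually running them requires ffmpeg to be installed.

```python
import shlex
from typing import List


def extract_frame_cmd(src: str, timestamp: float, dst: str) -> List[str]:
    """Grab one frame at `timestamp` seconds, e.g. to retouch and feed back in."""
    return ["ffmpeg", "-y", "-ss", f"{timestamp:.3f}", "-i", src,
            "-frames:v", "1", dst]


def reverse_cmd(src: str, dst: str) -> List[str]:
    """Play the clip backwards; audio is dropped with -an."""
    return ["ffmpeg", "-y", "-i", src, "-vf", "reverse", "-an", dst]


def slow_motion_cmd(src: str, dst: str, factor: float = 2.0,
                    fps: int = 24) -> List[str]:
    """Stretch the clip by `factor`, then interpolate frames up to `fps`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # setpts stretches the timestamps; minterpolate synthesizes in-between frames.
    video_filter = f"setpts={factor}*PTS,minterpolate=fps={fps}"
    return ["ffmpeg", "-y", "-i", src, "-vf", video_filter, "-an", dst]


# Hypothetical file names, for illustration only.
commands = [
    extract_frame_cmd("shot_03.mp4", 3.5, "shot_03_last.png"),
    reverse_cmd("shot_03.mp4", "shot_03_reversed.mp4"),
    slow_motion_cmd("shot_03.mp4", "shot_03_slow.mp4", factor=2.0),
]

for cmd in commands:
    # Print the command instead of executing it; pass it to subprocess.run to run it.
    print(shlex.join(cmd))
```

Doubling the duration of a 4-second clip this way yields roughly 8 seconds of footage, which is one way to stretch a shot that would otherwise be too short for the cut.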
As another tip for visual consistency in Runway, Machine Mythos recommends very precise prompt descriptions. He believes this automatically leads to similar results, presumably because of the limited training set of the Runway model. Other methods for improving consistency include image prompts and unique, categorical names for characters that can be reused in subsequent prompts.
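The idea of reusing one exact description and one unique character name across prompts can be sketched as a small prompt builder. This is a minimal illustration: the character name, the descriptions, and the function names are invented for the example and do not come from Machine Mythos or from Runway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Character:
    name: str         # unique, made-up name reused verbatim in every prompt
    description: str  # fixed, precise visual description


def shot_prompt(character: Character, action: str, style: str) -> str:
    """Compose a prompt that repeats the same character wording for each shot."""
    return f"{character.name}, {character.description}, {action}, {style}"


# Hypothetical character and style, for illustration only.
hero = Character("Korvath", "a frost-covered demon with cracked obsidian horns")
style = "cinematic lighting, 35mm film"

actions = ("walking across a frozen lake", "looking up at falling snow")
prompts = [shot_prompt(hero, action, style) for action in actions]

for prompt in prompts:
    print(prompt)
```

Only the action changes between shots, so every prompt carries identical wording for the character and the look.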
For upscaling, the AI director prefers "Topaz Labs," which can produce better results than Runway's direct upscaling, depending on the scene. Overall, Machine Mythos points out that patience and experimentation are key to the current AI filmmaking process.
Pika Labs: New text-to-video platform generates video sequences for sci-fi short film "The Last Artist"
For the latest AI short, "The Last Artist," the AI director used Pika Labs, a text-to-video platform currently in beta. Like Midjourney, it uses Discord as its user interface.
The "/create" command followed by a text description creates a three-second video. You can add parameters such as aspect ratio or motion intensity to fine-tune the result. The following video shows some samples created with Pika Labs.
According to Machine Mythos, it is much easier to generate epic and visually stunning scenes than simple dialogue scenes or basic interactions. So for now, he expects to see mostly hybrid movies with filmed and generated scenes in the coming months or years. Eventually, he sees AI-generated content gaining an edge: "Only the highest quality human produced content will remain, but I think it will remain."
The video sequences of the following sci-fi short film, "The Last Artist," were entirely generated with Pika Labs.