OpenAI's Sora 2 can handle knowledge questions, too. In a test by Epoch AI, Sora was given ten randomly selected questions from GPQA Diamond, a multiple-choice benchmark covering the natural sciences. Sora scored 55 percent, while GPT-5 managed 72 percent. To run the test, Epoch AI asked Sora to generate a video of a professor holding up a sheet of paper showing the answer letter.
Video: Epoch AI
Epoch AI points out that an upstream language model could rewrite the prompt before the video is generated and slot in the answer along the way. Other systems, like HunyuanVideo, use similar re-prompting techniques, but it's not confirmed whether Sora does the same. Either way, the line between text and video models is starting to blur.
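To make the re-prompting caveat concrete, here is a minimal Python sketch of what such a pipeline could look like. It is purely illustrative: `VideoRequest`, `upstream_rewrite`, `build_request`, and the `solve` callable are hypothetical names made up for this example, and nothing here reflects how Sora or HunyuanVideo is actually implemented. The point is only that if a language model expands the prompt first, the answer letter may already be in the text the video model receives.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VideoRequest:
    """Hypothetical container for the text the video model actually receives."""
    prompt: str


def upstream_rewrite(user_prompt: str, solve: Callable[[str], str]) -> str:
    """Stand-in for an upstream language model that expands the prompt.

    `solve` is a hypothetical callable that returns an answer letter.
    """
    letter = solve(user_prompt)
    return f"{user_prompt}\nThe sheet of paper shows the letter {letter}."


def build_request(
    user_prompt: str, solve: Optional[Callable[[str], str]] = None
) -> VideoRequest:
    """Build the video request, with or without a re-prompting step."""
    if solve is None:
        # No re-prompting: the video model sees the user's prompt unchanged.
        return VideoRequest(prompt=user_prompt)
    # Re-prompting: the answer is already spelled out in the prompt.
    return VideoRequest(prompt=upstream_rewrite(user_prompt, solve))


question = "A professor holds up a sheet of paper with the answer letter to: <question text>"
direct = build_request(question)
rewritten = build_request(question, solve=lambda _: "C")
```

In the `rewritten` case, the video model only has to render a letter it was handed, so a correct answer in the video would say little about the video model's own knowledge.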