ByteDance's StoryMem gives AI video models a memory so characters stop shapeshifting between scenes
ByteDance tackles one of AI video generation’s most persistent problems: characters that change appearance from scene to scene. The new StoryMem system remembers how characters and environments should look, keeping them consistent throughout an entire story.
AI reasoning models think harder on easy problems than hard ones, and researchers have a theory for why
If I spent more time thinking about a simple task than a complex one, and did worse on it, my boss would have some questions. But that's exactly what's happening with current reasoning models like DeepSeek-R1. A team of researchers took a closer look at the problem and proposed theoretical laws describing how AI models should ideally 'think.'
Meta's Pixio proves simple pixel reconstruction can beat complex vision models
Less is more: Meta's new image model, Pixio, beats more complex competitors at depth estimation and 3D reconstruction despite having fewer parameters, using a training method many had written off as outdated.
Meta brings Segment Anything to audio, letting editors pull sounds from video with a click or text prompt
Filtering a dog bark out of street noise or isolating a sound source with a single click on a video: Meta's SAM Audio brings the company's visual segmentation approach to the audio world. The model lets users edit audio with text prompts, clicks, or time markers. Code and weights are open source.