Many AI image generators already provide a powerful tool for modifying image content with text, called inpainting. Point-based editing makes adjustments even easier.

Researchers from Nanjing University and Tencent have developed a new AI-based image editing method called StableDrag that allows elements to be easily moved to new positions while maintaining the correct perspective, according to their paper.

The method builds on recent advances in AI image editing such as FreeDrag, DragDiffusion, and DragGAN, and delivers significantly better results in benchmarks.

One example is changing the viewing direction of the "Mona Lisa" by moving her nose a little to the right. The input image with the source (red) and destination (blue) points is shown on the left, the result of DragDiffusion in the middle, and the result of StableDrag-Diff on the right.

Example of a Mona Lisa whose head is turned by AI image processing until she is looking head-on into the camera
Image: Cui et al.

The tool works well on photos, illustrations, and other AI-generated images, with human faces and subjects like cars, landscapes, and animals.

The key innovations are a point tracking method to precisely localize updated target points and a confidence-based strategy to maintain high image quality at each step, the researchers explain. The confidence value evaluates the editing quality and reverts to original image features if it drops too low, preserving the source material without limiting editing options.
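
The two ideas can be illustrated with a minimal Python sketch. It is not the researchers' implementation: a plain cosine-similarity search stands in for their point tracker, and the function names `track_point` and `pick_supervision_feature`, the search `radius`, and the `threshold` of 0.5 are all hypothetical choices made for illustration.

```python
import numpy as np


def track_point(feature_map, template, prev_xy, radius=3):
    """Search a small window around prev_xy for the feature vector most
    similar to the template. Returns ((x, y), similarity).

    Assumption: cosine similarity is a simplified stand-in for the
    tracking method described in the paper.
    """
    height, width, _ = feature_map.shape
    x0, y0 = prev_xy
    best_xy, best_score = prev_xy, -np.inf
    for y in range(max(0, y0 - radius), min(height, y0 + radius + 1)):
        for x in range(max(0, x0 - radius), min(width, x0 + radius + 1)):
            feat = feature_map[y, x]
            denom = np.linalg.norm(feat) * np.linalg.norm(template) + 1e-8
            score = float(feat @ template / denom)
            if score > best_score:
                best_xy, best_score = (x, y), score
    return best_xy, best_score


def pick_supervision_feature(current_feat, original_feat, confidence, threshold=0.5):
    """Fall back to the original image feature when confidence is low.

    Assumption: the threshold value is illustrative, not from the paper.
    """
    return original_feat if confidence < threshold else current_feat


# Toy usage: random features, with the template taken from row 5, column 7.
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 16, 8))
template = features[5, 7].copy()

xy, conf = track_point(features, template, prev_xy=(6, 6))
chosen = pick_supervision_feature(features[xy[1], xy[0]], template, conf)
```

In this toy setup the tracker finds the template at `(x, y) = (7, 5)` with a similarity close to 1, so the current feature is kept. If the similarity fell below the threshold, the original feature would be used instead.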

While text-to-image generation has advanced rapidly and can now produce highly realistic photos, image manipulation is still catching up. Some AI models offer inpainting to alter selected areas via text input, but StableDrag's point-based editing promises more precision. The researchers say they will open-source the code soon.

Apple is taking a different approach to image manipulation with MGIE, which uses text prompts to add, remove, or change objects without selecting specific regions.

Summary
  • Researchers at Nanjing University and Tencent have developed StableDrag, a new point-based image processing method that allows elements to be moved to a new position in the image with little effort while maintaining the correct perspective.
  • StableDrag builds on existing models such as FreeDrag, DragDiffusion, and DragGAN, and delivers significantly better results in benchmarks when editing a wide range of subjects such as faces, cars, landscapes, and animals.
  • This gives users even easier ways to manipulate images without having to rely on the inaccurate conversion of text prompts.
Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.