YOLOv9 improves real-time object recognition accuracy with less computation

YOLOv9 sets a new standard for real-time object recognition. It offers greater accuracy with less computation than previous models.

YOLO, short for "You Only Look Once," is an open-source image analysis AI that recognizes objects in real time. The software enables machines to "see" like humans and identify a wide variety of objects in images.

YOLO is highly accurate and runs on standard computer hardware. It supports functions such as object recognition, instance segmentation, and image classification.
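To make "recognizing objects" concrete, here is a minimal sketch of what a detector's output for one image might look like: a class label, a confidence score, and a bounding box for each object. The `Detection` class, the helper functions, and all values are hypothetical illustrations. They are not YOLO's actual API or real model output.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Detection:
    """One recognized object (hypothetical structure, not YOLO's actual API)."""
    label: str                               # class name, e.g. "dog"
    confidence: float                        # model score between 0.0 and 1.0
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels


def filter_detections(detections: List[Detection],
                      min_confidence: float = 0.5) -> List[Detection]:
    """Keep only detections whose score is at least min_confidence."""
    return [d for d in detections if d.confidence >= min_confidence]


def count_by_label(detections: List[Detection]) -> Dict[str, int]:
    """Count how many detections were reported for each class label."""
    counts: Dict[str, int] = {}
    for d in detections:
        counts[d.label] = counts.get(d.label, 0) + 1
    return counts


# Made-up example values for a single image.
sample = [
    Detection("person", 0.91, (34.0, 20.0, 180.0, 410.0)),
    Detection("dog", 0.78, (200.0, 250.0, 330.0, 400.0)),
    Detection("dog", 0.32, (10.0, 300.0, 60.0, 350.0)),
]

confident = filter_detections(sample, min_confidence=0.5)
print(count_by_label(confident))
```

Applications typically discard low-confidence results like the third entry before acting on them. The threshold of 0.5 is an arbitrary choice for this example.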

YOLOv9 does more with less

YOLOv9 introduces two new techniques: Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). PGI provides more reliable gradient information for updating the network's weights during training, which leads to more accurate object recognition, while GELAN optimizes the network architecture to increase accuracy and speed. If you want to learn more about them, check out the paper.

Compared to YOLOv8, YOLOv9 reduces the number of parameters by 49 percent and the computational complexity by 43 percent, while increasing the Average Precision (AP) on the MS COCO dataset by 0.6 percentage points.
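The sketch below shows the arithmetic behind figures like these: a relative reduction for parameter counts and computation, and an absolute difference for AP, which is itself expressed in percent. The function names and all input values are made-up placeholders chosen for easy arithmetic. They are not measurements from the paper.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Percentage by which `new` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) * 100.0 / baseline


def absolute_gain(baseline_ap: float, new_ap: float) -> float:
    """Difference between two AP scores, in percentage points."""
    return new_ap - baseline_ap


def relative_gain(baseline_ap: float, new_ap: float) -> float:
    """The same AP difference expressed relative to the baseline, in percent."""
    if baseline_ap <= 0:
        raise ValueError("baseline_ap must be positive")
    return (new_ap - baseline_ap) * 100.0 / baseline_ap


# Hypothetical placeholder values, NOT taken from the YOLOv9 paper.
baseline_params_millions = 100.0
new_params_millions = 51.0
baseline_ap = 50.0
new_ap = 50.6

print(f"parameter reduction: "
      f"{relative_reduction(baseline_params_millions, new_params_millions):.1f} percent")
print(f"AP gain: {absolute_gain(baseline_ap, new_ap):.1f} percentage points")
print(f"AP gain relative to baseline: {relative_gain(baseline_ap, new_ap):.1f} percent")
```

With these placeholder inputs, a model with 51 million parameters instead of 100 million is 49 percent smaller, and an AP of 50.6 instead of 50.0 is a gain of 0.6 percentage points, or 1.2 percent relative to the baseline. The two ways of stating the AP gain differ, which is why the unit matters when comparing models.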

According to the developers, the flexibility of the GELAN architecture and the efficiency of PGI make it possible to adapt the models to the requirements of the inference systems without compromising performance.

Although YOLOv9 was developed specifically for object recognition, it can also be adapted to other machine vision tasks through improvements in the network architecture and training process.

The developers of YOLOv9, Chien-Yao Wang, I-Hau Yeh and Hong-Yuan Mark Liao, have published the source code on GitHub. Instructions for adapting YOLOv9 to your data are available here.
