YOLOv9 sets a new standard for real-time object detection. It offers greater accuracy with less computation than previous models.
YOLO, short for "You Only Look Once," is a family of open-source computer vision models that detect objects in real time. The software lets machines locate and identify a wide variety of objects in images.
YOLO is highly accurate and runs on standard computer hardware. It supports tasks such as object detection, instance segmentation, and image classification.
YOLOv9 does more with less
YOLOv9 introduces two new techniques: Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). PGI is designed to supply more reliable gradient information when the network weights are updated during training, which leads to more accurate object detection. GELAN is a network architecture designed to increase accuracy and speed. If you want to learn more about them, check out the paper.
Compared to YOLOv8, YOLOv9 reduces the number of parameters by 49 percent and the computational complexity by 43 percent, while increasing the Average Precision (AP) on the MS COCO dataset by 0.6 percentage points. Watch the video below to see how YOLOv9 compares to older YOLO models.
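The parameter and computation figures are relative reductions, while an AP change is normally read as an absolute difference between two AP scores. The following is a minimal Python sketch of that arithmetic. The function names and all input numbers are hypothetical placeholders chosen to illustrate the calculation; they are not measurements from the paper.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Return by how many percent `new` is smaller than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline * 100.0


def absolute_gain(baseline_ap: float, new_ap: float) -> float:
    """Return the AP difference in percentage points (both inputs in percent)."""
    return new_ap - baseline_ap


# Hypothetical placeholder values, NOT figures from the YOLOv9 paper.
baseline_params_m = 100.0  # parameter count of an older model, in millions
new_params_m = 51.0        # parameter count of a newer model, in millions
baseline_ap = 50.0         # AP of the older model, in percent
new_ap = 50.6              # AP of the newer model, in percent

reduction = relative_reduction(baseline_params_m, new_params_m)
gain = absolute_gain(baseline_ap, new_ap)

print(f"parameters: {reduction:.0f} percent fewer")
print(f"AP: +{gain:.1f} percentage points")
```

With these placeholder inputs, the newer model has about half the parameters of the older one, while the AP gain is a small absolute shift. This is why the two kinds of numbers should not be compared directly.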
According to the developers, the flexibility of the GELAN architecture and the efficiency of PGI make it possible to adapt the models to the requirements of the inference systems without compromising performance.
Although YOLOv9 was developed specifically for object detection, it can also be adapted to other machine vision tasks through changes to the network architecture and training process.
The developers of YOLOv9, Chien-Yao Wang, I-Hau Yeh and Hong-Yuan Mark Liao, have published the source code on GitHub. Instructions for adapting YOLOv9 to your data are available here.