Computer vision gives machines eyes that allow them to see the world similarly to humans. This enables many applications. The open-source software YOLOv8 shows the current state of the art.
YOLO ("You only look once") is an open-source image analysis AI system developed by the computer vision community since 2015. Although it is very accurate, it is small and runs on commodity computer hardware, even a Raspberry Pi. YOLO has built-in support for object detection, instance segmentation, and image classification.
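To illustrate how the three built-in tasks are typically selected in practice, here is a minimal sketch. It assumes the `ultralytics` Python package is installed (`pip install ultralytics`) and that pretrained weights can be downloaded on first use. The helper names `weights_name`, `run`, and `print_detections` are hypothetical and exist only for this example. The file-name pattern follows the naming of the published pretrained YOLOv8 weights, where a size letter (`n` for Nano through `x` for Extra Large) and an optional task suffix are combined.

```python
# Sketch only: assumes `pip install ultralytics` and network access for the
# first download of pretrained weights. Helper names are illustrative.

# Suffix used in the pretrained YOLOv8 weight file names for each task.
TASK_SUFFIX = {"detect": "", "segment": "-seg", "classify": "-cls"}
MODEL_SIZES = ("n", "s", "m", "l", "x")  # Nano, Small, Medium, Large, Extra Large


def weights_name(task: str, size: str = "n") -> str:
    """Build a pretrained weight file name such as 'yolov8n-seg.pt'."""
    if task not in TASK_SUFFIX:
        raise ValueError(f"unknown task: {task!r}")
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    return f"yolov8{size}{TASK_SUFFIX[task]}.pt"


def run(task: str, image_path: str, size: str = "n"):
    """Load a pretrained model for the given task and run it on one image."""
    # Imported lazily so that weights_name() works without the package.
    from ultralytics import YOLO

    model = YOLO(weights_name(task, size))
    results = model(image_path)  # one Results object per input image
    return results[0]


def print_detections(result) -> None:
    """Print class name and confidence for every detected box."""
    for box in result.boxes:
        class_id = int(box.cls)
        print(result.names[class_id], float(box.conf))
```

With a local image at hand, `print_detections(run("detect", "street.jpg"))` would list the detected objects; `street.jpg` is a placeholder path. For segmentation and classification, the returned result exposes `masks` and `probs` instead of only `boxes`.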
YOLOv8 is faster and more accurate than previous models
Compared to previous YOLO models, YOLOv8 is said to offer significant advances in image segmentation and object detection, especially in the more compact versions running on weaker hardware. For example, the smallest YOLOv8 model recognizes about 30 percent more objects in benchmarks than the smallest YOLOv5 version.
These objects include people, cars or baby carriages, but also details like flower pots, handbags, backpacks, or a knife at the vegetable stand in the marketplace.
The faster, more powerfully, and more reliably a CV system can detect and track objects in its environment, the more application scenarios become possible, for example everyday robots or augmented reality headsets that need to navigate and understand their surroundings.
YOLOv8 comes in five versions at the time of release (January 10, 2023). The smallest model, YOLOv8 Nano, achieves a mean average precision (mAP) of 37.3 in object detection; the largest, YOLOv8 Extra Large, achieves 53.9.
The mAP value is a common metric in computer vision for evaluating the performance of object recognition algorithms. It indicates how well an algorithm correctly detects objects and distinguishes them from false alarms. A higher mAP value usually means better performance.
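How such a score comes about can be sketched in a few lines. The example below is a simplified illustration, not the evaluation code behind the figures quoted above. It assumes each detection has already been marked as a correct hit or a false alarm, computes the average precision (AP) for one object class as the area under its precision-recall curve, and then averages the per-class values into mAP. Benchmark suites usually add further steps that are omitted here, such as averaging over several overlap thresholds and reporting the result on a scale of 0 to 100. The function names are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for a single object class.

    scores            confidence of each detection
    is_true_positive  True if the detection matched a real object, else False
    num_ground_truth  number of real objects of this class in the data
    """
    if num_ground_truth == 0 or not scores:
        return 0.0

    # Walk through detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = false_pos = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            true_pos += 1
        else:
            false_pos += 1
        precisions.append(true_pos / (true_pos + false_pos))
        recalls.append(true_pos / num_ground_truth)

    # Smooth the curve: precision at a recall level is the best precision
    # reachable at that recall or any higher one.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum up the area under the stepwise curve.
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(ap_per_class):
    """Average the per-class AP values, given as a dict of class name to AP."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

In this sketch, a false alarm ranked above a correct detection lowers the score, while a model that finds every object without any false alarms reaches the maximum of 1.0.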
Advances in computer vision could impact our daily lives as much as image and language AI systems
Since the release of OpenAI's DALL-E 2 and GPT-3, discussions about advances in AI have focused on image and language models.
But YOLOv8 also shows that machine vision is constantly evolving and becoming more powerful. This could have as much impact on our daily lives as language and image systems, or even more, whether utopian (such as self-driving cars) or dystopian (ubiquitous surveillance, automated warfare).
But check it out for yourselves: The following video documents the speed and precision of YOLOv8 in object detection and tracking.
What makes YOLO special, besides its performance, is the model's troubled history: original YOLO developer Joe Redmon stopped working on the software in 2020. The potential misuse of YOLO for military or surveillance applications was, in his view, "impossible to ignore," Redmon said at the time.
Redmon stopped working on YOLO with version 3, but the CV community continued. The latest version, v8, comes from Ultralytics, a company that works with the US Intelligence Community (IC) and the US Department of Defense (DoD), among others.
YOLOv8 is freely available on GitHub for open-source projects and academic applications. For commercial projects, a paid enterprise license via Ultralytics is required. Pricing is available upon request.