Computer vision gives machines eyes that allow them to see the world similarly to humans. This enables many applications. The open source software YOLOv8 shows the current state of the art.

YOLO ("You only look once") is an open-source image analysis AI system developed by the computer vision community since 2015. Although it is very accurate, it is small and runs on commodity computer hardware, even a Raspberry Pi. YOLO has built-in support for object detection, instance segmentation, and image classification.

YOLOv8 is faster and more accurate than previous models

Compared to previous YOLO models, YOLOv8 is said to offer significant advances in image segmentation and object detection, especially in the more compact versions running on weaker hardware. For example, the smallest YOLOv8 model recognizes about 30 percent more objects in benchmarks than the smallest YOLOv5 version.

These objects include people, cars or baby carriages, but also details like flower pots, handbags, backpacks, or a knife at the vegetable stand in the marketplace.

The faster and more reliably a CV system can detect and track objects in its environment, the more application scenarios become possible, e.g. for everyday robots or augmented reality headsets that need to navigate and understand their surroundings.

The performance of YOLOv8 compared to YOLOv5. | Image: Learn Open CV

YOLOv8 comes in five versions at the time of release (January 10, 2023). The smallest model, Nano, has a mean average precision (mAP) value of 37.3, and the largest, YOLOv8 Extra Large, reaches 53.9.

The mAP value is a common metric in computer vision for evaluating the performance of object recognition algorithms. It indicates how well an algorithm correctly detects objects and distinguishes them from false alarms. A higher mAP value usually means better performance.
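
To make the metric more concrete, here is a minimal Python sketch of how an average precision value can be computed for one object class and then averaged across classes. It is an illustration under stated assumptions, not the evaluation code behind the YOLOv8 figures: the function names and the toy inputs are hypothetical, and the decision whether a detection matches a real object is assumed to be given. Published benchmark values are typically also averaged over several overlap thresholds, which this sketch leaves out.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for a single object class.

    scores           -- confidence of each detection (hypothetical input)
    is_true_positive -- True if that detection matched a real object
    num_ground_truth -- number of real objects of this class in the data
    """
    if num_ground_truth == 0 or not scores:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1  # a false alarm lowers precision
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Interpolate: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum the rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """Mean of the per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)


# Toy example: four detections, one of them a false alarm, three real objects.
ap_example = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_true_positive=[True, False, True, True],
    num_ground_truth=3,
)
```

In this toy example a single false alarm ranked second pulls the score below the perfect value of 1.0, which matches the intuition that a higher mAP means more objects found with fewer false alarms.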

Advances in computer vision could impact our daily lives as much as image and language AI systems

Since the release of OpenAI's DALL-E 2 and GPT-3, discussions about advances in AI have focused on image and language models.

But YOLOv8 also shows that machine vision is constantly evolving and becoming more powerful. This could affect our daily lives as much as language and image systems, or even more, whether in utopian ways (like self-driving cars) or dystopian ones (ubiquitous surveillance, automated wars).

But check it out for yourselves: The following video documents the speed and precision of YOLOv8 in object detection and tracking.

What makes YOLO special, besides its performance, is the model's troubled history: original YOLO developer Joe Redmon stopped working on the software in 2020. The potential misuse of YOLO for military or surveillance applications was, in his view, "impossible to ignore," Redmon said at the time.

Redmon stopped working on YOLO with version 3, but the CV community continued its development. The latest version, v8, comes from Ultralytics, a company that works with the US Intelligence Community (IC) and the US Department of Defense (DoD), among others.

YOLOv8 is freely available on GitHub for open source projects and academic applications. For commercial projects, a paid enterprise license via Ultralytics is required. Pricing is available upon request.

Summary
  • YOLO is open source computer vision software developed by the computer vision community since 2015.
  • The latest version v8 is faster and more accurate than previous versions. For example, it recognizes more objects in a scene and shows the current state of the art.
  • Computer vision, for example for AR headsets, robots or surveillance drones, potentially has as big an impact on our lives as AI language or image models.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.