YOLO
You Only Look Once
A family of real-time object detection models that predict bounding boxes and class scores in a single forward pass.
In one line
One forward pass of a CNN predicts all bounding boxes and class probabilities at once — fast enough for video.
What it actually means
Earlier object detectors (R-CNN and friends) ran a two-stage pipeline: propose regions, then classify each region. YOLO reframes detection as a single regression problem. The image is divided into a grid; each cell predicts boxes and class scores directly. The result is orders of magnitude faster: YOLO models run in real time on webcam video, even on CPU for the smallest variants. The YOLO line has gone from v1 (Redmon et al., 2015) through v3, v4, v5, v8, and now v11 from Ultralytics, each one a mix of architecture tweaks, better augmentation, and larger training sets.
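The grid-to-box mapping can be sketched in a few lines. This is a toy decoder for a single cell, not any particular YOLO version's exact parameterization (offset and scale conventions differ between v1 and the later anchor-based versions):

```python
import math

def decode_cell(pred, row, col, grid_size, img_w, img_h):
    """Decode one grid cell's raw prediction into an absolute box.

    pred = (tx, ty, tw, th, conf): a toy, YOLO-style parameterization where
    (tx, ty) are sigmoid offsets of the box center within its cell and
    (tw, th) are box width/height as fractions of the whole image.
    """
    tx, ty, tw, th, conf = pred
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    cell_w, cell_h = img_w / grid_size, img_h / grid_size
    cx = (col + sig(tx)) * cell_w   # absolute box center x
    cy = (row + sig(ty)) * cell_h   # absolute box center y
    w, h = tw * img_w, th * img_h   # absolute width and height
    # Return (x1, y1, x2, y2, confidence)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, sig(conf))
```

For a 7x7 grid on a 448x448 image, the center cell with zero offsets decodes to a box centered at (224, 224); the real models do this for every cell (and every anchor) in one tensor operation, which is what makes the single forward pass sufficient.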
Why it matters
For any real-time object detection, counting, or tracking job — retail analytics, drones, sports, industrial inspection — YOLO is the default. The Ultralytics Python package turned the whole family into a one-line API: load a pretrained model, fine-tune on your own data, export to ONNX or TensorRT. Newer detectors exist (DETR, RT-DETR) but YOLO still wins on the practical metric of “fastest way to ship a working detector”.
Example
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # pretrained nano model; weights download on first use
results = model("frame.jpg")  # run inference on a single image
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)  # class id, confidence, corner coordinates
You’ll hear it when
- Building any real-time object detection system.
- Comparing speed/accuracy on the COCO benchmark.
- Fine-tuning a detector on a custom domain.
- Shipping CV to edge devices.