Deep Learning based object Detection with YOLO v2
Jumabek Alikhanov
September 29, 2019
I will briefly go through the process of YOLOv2.
Transcript
YOLO9000: Better, Faster, Stronger (CVPR 2017, Best Paper Honorable Mention)
Jumabek Alikhanov, Information Security Research Lab, Inha University
CONTENTS
1. Introduction & Previous Work
2. Better detection performance
3. Faster processing speed
4. Detecting more classes (object types)
5. Conclusion
Task & Evaluation Metric
mAP: mean Average Precision
https://github.com/rafaelpadilla/Object-Detection-Metrics
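To make the metric concrete, here is a minimal sketch of how mAP is commonly computed: detections are matched to ground truth by IoU, per-class AP is the area under the interpolated precision-recall curve, and mAP is the mean over classes. The function names and the all-point interpolation choice are assumptions for illustration, not taken from the slides.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class.

    is_true_positive: one bool per detection, sorted by descending confidence.
    num_ground_truth: number of ground-truth boxes of this class (assumed > 0).
    """
    tp, fp = 0, 0
    recalls, precisions = [], []
    for hit in is_true_positive:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Make precision non-increasing when read from left to right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values (dict: class name -> AP)."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

A detection usually counts as a true positive when its IoU with an unmatched ground-truth box of the same class is at least 0.5, which is the Pascal VOC convention.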
YOLO v1 Network
Output shape = (S, S, B×5 + C) = (7, 7, 2×5 + 20) = (7, 7, 30)
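The arithmetic behind that shape can be checked with a few lines. This is only an illustrative helper (the function name is an assumption): each of the B boxes per cell carries 5 values (x, y, w, h, confidence), and each cell also carries C class probabilities.

```python
def yolo_v1_output_shape(grid_size=7, boxes_per_cell=2, num_classes=20):
    """Return (S, S, B*5 + C) for the YOLO v1 output tensor."""
    # 5 values per box: x, y, w, h and a confidence score.
    depth = boxes_per_cell * 5 + num_classes
    return (grid_size, grid_size, depth)


print(yolo_v1_output_shape())  # (7, 7, 30)
```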
YOLOv1: Loss Function
p_i: conditional class probability
C_i: box confidence score
Loss terms: Localization, Confidence, Classification
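For reference, the three groups of terms correspond to the sum-squared-error loss in the YOLOv1 paper, reproduced here as a sketch. The weights come from that paper, not from the slide: it uses \lambda_{coord} = 5 and \lambda_{noobj} = 0.5. The indicator 1_{ij}^{obj} is 1 when box j in cell i is responsible for an object.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
  \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]
  && \text{(localization)} \\
&+ \lambda_{coord} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj}
  \left[ (\sqrt{w_i} - \sqrt{\hat{w}_i})^2 + (\sqrt{h_i} - \sqrt{\hat{h}_i})^2 \right]
  && \text{(localization)} \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{obj} (C_i - \hat{C}_i)^2
  + \lambda_{noobj} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{noobj} (C_i - \hat{C}_i)^2
  && \text{(confidence)} \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{obj} \sum_{c \in classes} (p_i(c) - \hat{p}_i(c))^2
  && \text{(classification)}
\end{aligned}
```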
Previously

Model          Pascal 2007 mAP   Speed (FPS)   Speed (time/img)
DPM v5         33.7              0.07          14 s
R-CNN          66.0              0.05          20 s
Fast R-CNN     70.0              0.5           2 s
Faster R-CNN   73.2              7             140 ms
YOLO           63.4              45            22 ms
Better Performance
YOLO: train on ImageNet, resize network, fine-tune on detection.
Fine-tune 448x448 classifier: +3.5% mAP
Train on ImageNet, resize and fine-tune on ImageNet, then fine-tune on detection.
Anchor boxes use static initialization
Use k-means clustering to find better initializations https://github.com/Jumabek/darknet_scripts
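A minimal sketch of the clustering idea, assuming the distance from the YOLO9000 paper, d(box, centroid) = 1 - IoU(box, centroid), computed on widths and heights only. The function names, the random initialization, and the mean-based centroid update are illustrative choices and are not necessarily what the linked scripts do.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as distance."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Smallest 1 - IoU distance is the same as largest IoU.
            best = max(range(k), key=lambda c: wh_iou(box, centroids[c]))
            clusters[best].append(box)
        new_centroids = []
        for index, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[index])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


# Toy ground-truth box sizes (made up for illustration).
boxes = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.0), (5.0, 5.0), (5.5, 4.8), (4.9, 5.2)]
anchors = kmeans_anchors(boxes, k=2)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering error, which is the motivation the paper gives for this choice.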
Static Anchors vs Dimension Clusters
Box Location Prediction
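The slide text does not spell out the formulas, so the sketch below follows the direct location prediction described in the YOLO9000 paper: b_x = sigmoid(t_x) + c_x, b_y = sigmoid(t_y) + c_y, b_w = p_w * exp(t_w), b_h = p_h * exp(t_h), where (c_x, c_y) is the grid cell offset and (p_w, p_h) is the anchor (prior) size. The function and parameter names are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t_x, t_y, t_w, t_h, cell_x, cell_y, prior_w, prior_h):
    """Turn raw network outputs into a box in grid-cell units."""
    # The sigmoid keeps the predicted center inside its own grid cell.
    b_x = sigmoid(t_x) + cell_x
    b_y = sigmoid(t_y) + cell_y
    # Width and height scale the anchor size exponentially.
    b_w = prior_w * math.exp(t_w)
    b_h = prior_h * math.exp(t_h)
    return (b_x, b_y, b_w, b_h)


# Zero outputs place the box at the cell center with the anchor's size.
print(decode_box(0.0, 0.0, 0.0, 0.0, 3, 4, 2.0, 1.5))  # (3.5, 4.5, 2.0, 1.5)
```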
Dimension Clusters: +5% mAP
Multi-scale training: +1.5% mAP
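As a rough sketch of what multi-scale training means in practice: the YOLO9000 paper reports picking a new input resolution every 10 batches from the multiples of 32 between 320 and 608. Those numbers come from the paper rather than the slide, and the function below is a hypothetical helper that only generates the schedule of sizes.

```python
import random


def multiscale_schedule(num_batches, interval=10, min_size=320,
                        max_size=608, stride=32, seed=0):
    """Return the input resolution to use for each training batch."""
    rng = random.Random(seed)
    # Candidate sizes: 320, 352, ..., 608 (all multiples of the stride).
    sizes = list(range(min_size, max_size + 1, stride))
    schedule = []
    current = None
    for batch in range(num_batches):
        # Draw a new resolution at the start of every interval.
        if batch % interval == 0:
            current = rng.choice(sizes)
        schedule.append(current)
    return schedule


schedule = multiscale_schedule(30)
```

Because the network is fully convolutional and downsamples by a factor of 32, the same weights can run at any of these resolutions.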
YOLOv2: Fast, Accurate Detection
Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object
detectors." arXiv preprint arXiv:1611.10012 (2016).
Faster Detection Speed
Speed is not just parameter counts or FLOPs
Darknet19: A good balance of speed and accuracy

Model                 Top 1   Top 5   FLOPs      GPU Speed
VGG-16                70.5    90.0    30.95 Bn   100 FPS
Extraction (YOLOv1)   72.5    90.8    8.52 Bn    180 FPS
Resnet50              75.3    92.2    7.66 Bn    90 FPS
Darknet19             74.0    91.8    5.58 Bn    200 FPS
Why is it fast?
- Simple & efficient architecture
- C implementation
Stronger - Detecting more classes
- 14 million images
- 22k classes
- Classification labels

- 100k images
- 80 classes
- Detection labels

Golden eagle
Typically use softmax over all classes
Can’t just mash classes together...
WordNet has structure but it’s messy
Each node is a conditional probability:
P(Bedlington terrier) = P(object) * P(living thing | object) * … * P(canine | mammal) * P(dog | canine) * P(terrier | dog) * P(Bedlington terrier | terrier)
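The chain of conditional probabilities can be sketched as a walk from a node up to the root, multiplying as it goes. The tree, the node names, and the probability values below are a made-up toy example, not the real WordTree, and the function name is an assumption.

```python
def absolute_probability(node, parent_of, conditional_prob):
    """Multiply P(node | parent) along the path from node up to the root."""
    prob = 1.0
    while node is not None:
        prob *= conditional_prob[node]
        # The root has no entry in parent_of, which ends the walk.
        node = parent_of.get(node)
    return prob


# Hypothetical three-level tree: object -> dog -> terrier.
parent_of = {"terrier": "dog", "dog": "object"}
conditional_prob = {
    "object": 0.9,   # P(object), the root's own score
    "dog": 0.5,      # P(dog | object)
    "terrier": 0.4,  # P(terrier | dog)
}

p_terrier = absolute_probability("terrier", parent_of, conditional_prob)
```

Here p_terrier is the product 0.9 * 0.5 * 0.4, in the same way the slide multiplies conditionals down to Bedlington terrier.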
Conclusion
• YOLOv2 and YOLO9000 real-time detection systems
• YOLOv2 state of the art and faster than other systems
• 9K object category detection by YOLO9000
References
1. CVPR paper - https://pjreddie.com/media/files/papers/YOLO9000.pdf
2. Article - https://medium.com/@jonathan_hui/real-time-object-detection-with-yolo-yolov2-28b1b93e2088
3. Author’s Presentation - https://docs.google.com/presentation/d/14qBAiyhMOFl_wZW4dA1CkixgXwf0zKGbpw_0oHK8yEM/edit#slide=id.g1f9fb98e4b_0_132