Deep Learning based object Detection with YOLO v2
Jumabek Alikhanov
September 29, 2019
I will briefly go through the process of YOLOv2.
Transcript
Jumabek Alikhanov @ Information Security Research Lab, Inha University
YOLO9000: Better, Faster, Stronger (CVPR 2017, Best Paper Honorable Mention)
CONTENTS
1. Introduction & Previous Work
2. Better detection performance
3. Faster processing speed
4. Detecting more classes (object types)
5. Conclusion
Task & Evaluation Metric
mAP: mean Average Precision
https://github.com/rafaelpadilla/Object-Detection-Metrics
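To make the metric concrete, here is a minimal sketch of how AP for one class could be computed from ranked detections, with mAP as the mean over classes. It is a simplified illustration, not the code from the linked repository: it assumes all boxes come from a single image, uses an IoU threshold of 0.5, and uses all-point interpolation. The function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, gt_boxes, iou_thresh=0.5):
    """AP for one class. detections: list of (confidence, box)."""
    if not gt_boxes:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    matched, tp_flags = set(), []
    for _, box in ranked:
        # Find the ground-truth box that overlaps this detection the most.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gt_boxes):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A ground-truth box can be claimed by only one detection.
        is_tp = best_iou >= iou_thresh and best_idx not in matched
        if is_tp:
            matched.add(best_idx)
        tp_flags.append(is_tp)

    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / rank)
        recalls.append(tp / len(gt_boxes))
    # Precision envelope: non-increasing when read left to right.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, gt_boxes)."""
    aps = [average_precision(d, g) for d, g in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

A real evaluation would pool detections across the whole test set before ranking; the single-image assumption here only keeps the sketch short.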
YOLO v1 Network
Output shape = (S, S, B×5 + C) = (7, 7, 2×5 + 20) = (7, 7, 30)
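The arithmetic behind the output shape can be checked with a few lines. This is only a sketch; the function name is hypothetical, and the factor 5 stands for the four box coordinates plus one confidence score per box.

```python
def yolo_v1_output_shape(S=7, B=2, C=20):
    """Return the (S, S, B*5 + C) output shape of the YOLO v1 grid.

    S: grid size, B: boxes per cell, C: number of classes.
    Each box contributes 5 values: x, y, w, h and a confidence score.
    """
    return (S, S, B * 5 + C)


print(yolo_v1_output_shape())  # (7, 7, 30)
```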
YOLOv1: Loss Function
p_i: conditional class probability
C_i: box confidence score
Loss terms: Localization, Confidence, Classification
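For reference, the loss as given in the original YOLO (v1) paper can be written as below, grouped under the three labels from the slide. The symbols other than p_i and C_i (the weights λ_coord and λ_noobj, the indicator functions, and the hatted predictions) are taken from that paper, not from the slide; the paper sets λ_coord = 5 and λ_noobj = 0.5.

```latex
\begin{aligned}
\mathcal{L} ={}&
\underbrace{\lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{obj}}
  \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2
       + \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
       + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right]}_{\text{Localization}} \\
&+ \underbrace{\sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{obj}} \left(C_i - \hat{C}_i\right)^2
  + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
  \mathbb{1}_{ij}^{\text{noobj}} \left(C_i - \hat{C}_i\right)^2}_{\text{Confidence}} \\
&+ \underbrace{\sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
  \sum_{c \in \text{classes}} \left(p_i(c) - \hat{p}_i(c)\right)^2}_{\text{Classification}}
\end{aligned}
```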
Previously
Method         Pascal 2007 mAP   Speed      Time per image
DPM v5         33.7              0.07 FPS   14 s/img
R-CNN          66.0              0.05 FPS   20 s/img
Fast R-CNN     70.0              0.5 FPS    2 s/img
Faster R-CNN   73.2              7 FPS      140 ms/img
YOLO           63.4              45 FPS     22 ms/img
Better Performance
YOLO: train on ImageNet, then resize the network and fine-tune on detection.
Fine-tune 448x448 classifier: +3.5% mAP. Train on ImageNet, then resize and fine-tune on ImageNet, then fine-tune on detection.
Anchor boxes use static initialization
Use k-means clustering to find better initializations https://github.com/Jumabek/darknet_scripts
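The clustering idea can be sketched as follows. This is a minimal illustration, not the code from the linked repository: it assumes each ground-truth box is reduced to a (width, height) pair, and it assigns boxes to the centroid with the highest IoU (equivalently, the smallest distance 1 - IoU), as described in the YOLO9000 paper. The function names and defaults are hypothetical.

```python
import random


def wh_iou(a, b):
    """IoU of two (w, h) boxes that share the same top-left corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchor shapes using distance 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        # Assignment step: highest IoU means smallest 1 - IoU distance.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        # Update step: mean width and height of each cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)


boxes = [(1, 1), (1.1, 0.9), (0.9, 1.1), (5, 5), (5.2, 4.8), (4.8, 5.2)]
print(kmeans_anchors(boxes, k=2))  # two anchors, one small and one large
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the error, which is the motivation the paper gives for this distance.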
Static Anchors vs Dimension Clusters
Box Location Prediction
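The slide's figure did not survive extraction, so here is a sketch of the direct location prediction described in the YOLO9000 paper: the network outputs (tx, ty, tw, th, to) per anchor, the centre is offset from the grid cell's top-left corner (cx, cy) through a sigmoid, and the size scales the anchor prior (pw, ph) exponentially. The function names are hypothetical, and coordinates are in grid-cell units.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, prior):
    """Turn raw predictions into a box, following the YOLO9000 paper.

    t:     (tx, ty, tw, th, to) raw network outputs for one anchor
    cell:  (cx, cy) top-left corner of the grid cell
    prior: (pw, ph) anchor width and height
    """
    tx, ty, tw, th, to = t
    cx, cy = cell
    pw, ph = prior
    bx = sigmoid(tx) + cx      # centre stays inside the cell
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)     # size is a scaled version of the prior
    bh = ph * math.exp(th)
    confidence = sigmoid(to)   # objectness score in (0, 1)
    return bx, by, bw, bh, confidence


print(decode_box((0, 0, 0, 0, 0), cell=(3, 4), prior=(2.0, 1.5)))
# (3.5, 4.5, 2.0, 1.5, 0.5)
```

Bounding the centre offset with a sigmoid is what the paper credits for more stable training compared with unconstrained anchor offsets.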
Dimension Clusters: +5% mAP
Multi-scale training: +1.5% mAP
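As a sketch of what multi-scale training involves: the YOLO9000 paper describes picking a new input resolution every 10 batches from multiples of 32 between 320 and 608, since the network downsamples by a factor of 32. The helper names below are hypothetical and the training loop itself is omitted.

```python
import random


def multiscale_sizes(min_size=320, max_size=608, stride=32):
    """Candidate square input sizes: multiples of the network stride."""
    return list(range(min_size, max_size + 1, stride))


def pick_input_size(batch_idx, current_size, rng, every=10):
    """Choose a new random input size every `every` batches."""
    if batch_idx % every == 0:
        return rng.choice(multiscale_sizes())
    return current_size


rng = random.Random(0)
size = 416
for batch_idx in range(30):
    size = pick_input_size(batch_idx, size, rng)
    # ... resize the batch to (size, size) and run one training step ...
```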
YOLOv2: Fast, Accurate Detection
Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object
detectors." arXiv preprint arXiv:1611.10012 (2016).
Faster Detection Speed
Speed is not just parameter counts or FLOPs
Network               Top 1   Top 5   FLOPs      GPU Speed
VGG-16                70.5    90.0    30.95 Bn   100 FPS
Extraction (YOLOv1)   72.5    90.8    8.52 Bn    180 FPS
Resnet50              75.3    92.2    7.66 Bn    90 FPS
Darknet19: A good balance of speed and accuracy
Network               Top 1   Top 5   FLOPs      GPU Speed
VGG-16                70.5    90.0    30.95 Bn   100 FPS
Extraction (YOLOv1)   72.5    90.8    8.52 Bn    180 FPS
Resnet50              75.3    92.2    7.66 Bn    90 FPS
Darknet19             74.0    91.8    5.58 Bn    200 FPS
Why is it fast?
- Simple & efficient architecture
- C implementation
Stronger - Detecting more classes
ImageNet:
- 14 million images
- 22k classes
- Classification labels
COCO:
- 100k images
- 80 classes
- Detection labels
Example label: Golden eagle
Typically use softmax over all classes
Can’t just mash classes together...
WordNet has structure but it’s messy
Each node is a conditional probability:
P(Bedlington terrier) = P(object) * P(living thing | object) * ... * P(canine | mammal) * P(dog | canine) * P(terrier | dog) * P(Bedlington terrier | terrier)
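The product above can be sketched as a walk from a node up to the root of the tree, multiplying the conditional probabilities along the way. The tree fragment and all probability values below are made up for illustration, and the function name is hypothetical.

```python
def absolute_probability(node, parent, cond_prob):
    """Multiply conditional probabilities from `node` up to the root.

    parent:    dict mapping each node to its parent (root maps to None)
    cond_prob: dict mapping each node to P(node | parent); for the root
               this is P(object), the objectness score
    """
    prob = 1.0
    while node is not None:
        prob *= cond_prob[node]
        node = parent.get(node)
    return prob


# Hypothetical fragment of the tree with made-up probabilities.
parent = {"object": None, "dog": "object", "terrier": "dog"}
cond_prob = {"object": 0.9, "dog": 0.5, "terrier": 0.5}

print(absolute_probability("terrier", parent, cond_prob))
# P(object) * P(dog | object) * P(terrier | dog)
```

Because each node only models a choice among its siblings, a detector can stop at a higher level such as "dog" when it is not confident about the specific breed.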
Conclusion
• YOLOv2 and YOLO9000 are real-time detection systems
• YOLOv2 is state of the art and faster than other systems
• YOLO9000 detects 9K object categories
References
1. CVPR paper - https://pjreddie.com/media/files/papers/YOLO9000.pdf
2. Article - https://medium.com/@jonathan_hui/real-time-object-detection-with-yolo-yolov2-28b1b93e2088
3. Author’s Presentation - https://docs.google.com/presentation/d/14qBAiyhMOFl_wZW4dA1CkixgXwf0zKGbpw_0oHK8yEM/edit#slide=id.g1f9fb98e4b_0_132