Deep Learning based object Detection with YOLO v2
Jumabek Alikhanov
September 29, 2019
I will briefly go through the process of YOLOv2.
Transcript
Jumabek Alikhanov @ Information Security Research Lab, Inha University
YOLO9000: Better, Faster, Stronger (CVPR 2017, Best Paper Honorable Mention)
CONTENTS
1. Introduction & Previous Work
2. Better detection performance
3. Faster processing speed
4. Detecting more classes (object types)
5. Conclusion
Task & Evaluation Metric: mAP (mean Average Precision)
https://github.com/rafaelpadilla/Object-Detection-Metrics
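As a rough illustration of the metric, the sketch below computes average precision (AP) for one class with all-point interpolation and then averages per-class APs into mAP. The function names and the input layout (a confidence score and a true-positive flag per detection) are assumptions made for this example, and matching detections to ground truth by IoU is assumed to have happened already.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """All-point interpolated AP for one class.

    scores: confidence of each detection (hypothetical input layout).
    is_true_positive: whether each detection matched a ground-truth box.
    num_ground_truth: number of ground-truth boxes of this class (must be > 0).
    """
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the maximum precision at any higher recall.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap = 0.0
    previous_recall = 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```

Other interpolation schemes (for example 11-point sampling) give slightly different numbers, so reported mAP values are only comparable when the same scheme is used.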
YOLO v1 Network
Output shape = (S, S, B×5 + C) = (7, 7, 2×5 + 20) = (7, 7, 30)
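The arithmetic on this slide can be checked with a few lines. The function name and parameter names below are made up for this example; S is the grid size, B the number of boxes per cell (each with x, y, w, h and a confidence), and C the number of classes.

```python
def yolo_v1_output_shape(grid_size=7, boxes_per_cell=2, num_classes=20):
    """Return (S, S, B*5 + C) for a YOLO v1 style output tensor."""
    # Each box contributes 5 values: x, y, w, h and a confidence score.
    depth = boxes_per_cell * 5 + num_classes
    return (grid_size, grid_size, depth)


print(yolo_v1_output_shape())
```

With the defaults this gives the (7, 7, 30) shape stated on the slide.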
YOLOv1: Loss Function
p_i: conditional class probability
C_i: box confidence score
Loss terms: Localization, Confidence, Classification
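For reference, the three groups of terms can be written out as the sum-squared-error loss from the original YOLO paper. This is a transcription from that paper rather than from the slide: S and B are as on the previous slide, the indicator symbols select cells and boxes responsible for an object, and the paper sets the weights to 5 for coordinates and 0.5 for no-object confidence.

```latex
\begin{aligned}
\mathcal{L} ={}& \underbrace{\lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
  \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2
       + \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
       + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right]}_{\text{Localization}} \\
&+ \underbrace{\sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2
  + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2}_{\text{Confidence}} \\
&+ \underbrace{\sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2}_{\text{Classification}},
\qquad \lambda_{\text{coord}} = 5,\quad \lambda_{\text{noobj}} = 0.5
\end{aligned}
```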
Previously
Detector        Pascal 2007 mAP   Speed
DPM v5          33.7              .07 FPS   14 s/img
R-CNN           66.0              .05 FPS   20 s/img
Fast R-CNN      70.0              .5 FPS    2 s/img
Faster R-CNN    73.2              7 FPS     140 ms/img
YOLO            63.4              45 FPS    22 ms/img
Better Performance
YOLO: Train on ImageNet; resize network; fine-tune on detection
Fine-tune 448x448 Classifier: +3.5% mAP
Train on ImageNet; resize, fine-tune on ImageNet; fine-tune on detection
Anchor boxes use static initialization
Use k-means clustering to find better initializations https://github.com/Jumabek/darknet_scripts
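A minimal sketch of the clustering idea follows, assuming boxes are given as (width, height) pairs and using 1 - IoU as the distance, as described in the YOLO9000 paper. The function names, the mean-based centroid update, and the random initialization are choices made for this example; they are not taken from the linked scripts.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    w, h = box
    cw, ch = centroid
    intersection = min(w, cw) * min(h, ch)
    return intersection / (w * h + cw * ch - intersection)


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs; the distance is 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as lowest (1 - IoU) distance.
            best = max(range(k), key=lambda j: iou_wh(box, centroids[j]))
            clusters[best].append(box)
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append((
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                ))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids)
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, since the error no longer scales with box size.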
Static Anchors vs Dimension Clusters
Box Location Prediction
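The slide text does not carry the formulas, so the sketch below follows the parameterization given in the YOLO9000 paper (reference 1): the network predicts t_x, t_y, t_w, t_h per anchor, the cell offset is (c_x, c_y), and the anchor prior has size (p_w, p_h). The function name `decode_box` is made up for this example, and all values are in grid-cell units.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Turn raw predictions into a box center and size, in grid-cell units."""
    # The sigmoid keeps the center inside the cell that predicts it.
    b_x = sigmoid(t_x) + c_x
    b_y = sigmoid(t_y) + c_y
    # Width and height rescale the anchor prior by an exponential factor.
    b_w = p_w * math.exp(t_w)
    b_h = p_h * math.exp(t_h)
    return b_x, b_y, b_w, b_h
```

For example, all-zero predictions in cell (3, 4) with a 2.0 by 1.5 prior give a box centered in the middle of that cell with exactly the prior's size.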
Dimension Clusters: +5% mAP
Multi-scale training: +1.5% mAP
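As a sketch of how multi-scale training schedules input sizes: the YOLO9000 paper describes picking a new input resolution every 10 batches from the multiples of 32 between 320 and 608. The function below only generates that schedule; its name and parameters are illustrative, and the actual resizing of the network and images is left out.

```python
import random


def multiscale_sizes(num_batches, interval=10, min_size=320,
                     max_size=608, stride=32, seed=0):
    """Return the square input size to use for each training batch."""
    rng = random.Random(seed)
    # Candidate sizes are multiples of the network's downsampling factor.
    choices = list(range(min_size, max_size + 1, stride))
    sizes = []
    size = None
    for batch in range(num_batches):
        if batch % interval == 0:
            # Draw a new size at the start of every interval.
            size = rng.choice(choices)
        sizes.append(size)
    return sizes
```

Because the network is fully convolutional, the same weights can be run at any of these sizes, which is what allows trading accuracy for speed at test time.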
YOLOv2: Fast, Accurate Detection
Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object
detectors." arXiv preprint arXiv:1611.10012 (2016).
Faster Detection Speed
Speed is not just parameter counts or FLOPs
Model                 Top 1   Top 5   FLOPs      GPU Speed
VGG-16                70.5    90.0    30.95 Bn   100 FPS
Extraction (YOLOv1)   72.5    90.8    8.52 Bn    180 FPS
Resnet50              75.3    92.2    7.66 Bn    90 FPS
Darknet19: A good balance of speed and accuracy
Model                 Top 1   Top 5   FLOPs      GPU Speed
VGG-16                70.5    90.0    30.95 Bn   100 FPS
Extraction (YOLOv1)   72.5    90.8    8.52 Bn    180 FPS
Resnet50              75.3    92.2    7.66 Bn    90 FPS
Darknet19             74.0    91.8    5.58 Bn    200 FPS
Why is it fast?
- Simple & efficient architecture
- C implementation
Stronger - Detecting more classes
- 14 million images, 22k classes, classification labels
- 100k images, 80 classes, detection labels
Golden eagle
Typically use softmax over all classes
Can’t just mash classes together...
WordNet has structure but it’s messy
Each node is a conditional probability:
P(Bedlington terrier) = P(object) * P(living thing | object) * ... * P(canine | mammal) * P(dog | canine) * P(terrier | dog) * P(Bedlington terrier | terrier)
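The product above can be sketched as a walk from a node up to the root of the tree, multiplying the conditional probability stored at each node. The dictionaries, the shortened path (the slide elides the nodes between "living thing" and "mammal"), and the probability values below are all made up for illustration.

```python
def absolute_probability(node, parent_of, conditional_prob):
    """Multiply conditional probabilities along the path from node to root."""
    prob = 1.0
    while node is not None:
        # conditional_prob[node] stands for P(node | parent of node);
        # for the root it is simply P(root).
        prob *= conditional_prob[node]
        node = parent_of.get(node)
    return prob


# Hypothetical, simplified path through the tree (not the real WordTree).
parent_of = {
    "living thing": "object",
    "mammal": "living thing",
    "canine": "mammal",
    "dog": "canine",
    "terrier": "dog",
    "Bedlington terrier": "terrier",
}
# Invented values, chosen only to make the arithmetic easy to follow.
conditional_prob = {
    "object": 1.0,
    "living thing": 0.5,
    "mammal": 0.5,
    "canine": 0.5,
    "dog": 0.5,
    "terrier": 0.5,
    "Bedlington terrier": 0.5,
}

print(absolute_probability("Bedlington terrier", parent_of, conditional_prob))
```

Stopping the walk at an intermediate node such as "dog" gives the probability of that coarser label, which is how a hierarchical classifier can fall back to a less specific answer.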
Conclusion
• YOLOv2 and YOLO9000: real-time detection systems
• YOLOv2: state of the art and faster than other systems
• 9K object category detection by YOLO9000
References
1. CVPR paper - https://pjreddie.com/media/files/papers/YOLO9000.pdf
2. Article - https://medium.com/@jonathan_hui/real-time-object-detection-with-yolo-yolov2-28b1b93e2088
3. Author’s Presentation - https://docs.google.com/presentation/d/14qBAiyhMOFl_wZW4dA1CkixgXwf0zKGbpw_0oHK8yEM/edit#slide=id.g1f9fb98e4b_0_132