Building an Object Detection toolkit with TensorFlow (PyLadies Meetup)

Building an Object Detection toolkit with TensorFlow: From academic papers to open source implementation

Who we are
Tryolabs (@tryolabs)
Alan Descoins, CTO (@dekked_)
Introduction

Then vs now

Then: Felzenszwalb et al., “Object Detection with Discriminatively Trained Part Based Models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 2010.
Now: Detected objects in a sample image (from the COCO dataset) (2017). Source: Google Research Blog.

Agenda
• Challenges and applications of object detection
• Demystifying it: dive into Faster R-CNN
• Luminoth: our open-source toolkit for computer vision

Challenges of object detection

Applications of object detection
Image credits:
• CT scan of a lung cancer patient at the Jingdong Zhongmei private hospital in Yanjiao, China's Hebei Province (AP Photo/Andy Wong)
• Hsieh et al., “Drone-based Object Counting by Spatially Regularized Regional Proposal Networks”, ICCV 2017.
• Source: Pinterest

A hard problem with lots of applications
Make it accessible! Build a toolkit!

Deep Learning & Object detection

Power of ConvNets as feature extractors
Pre-train:
[Figure: convolutional feature map. Figure from https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/]

Types of object detection models

Regression based methods
Single stage prediction of object classes and bounding boxes.
Examples:
• You Only Look Once (YOLO)
• Single Shot MultiBox Detector (SSD)

Region proposal based methods
Two stages:
1. Generate candidate locations using some algorithm.
2. Adjustment of bounding boxes and classification.
Examples:
• R-CNN, Fast R-CNN, Faster R-CNN

Faster R-CNN

Background
Evolution of methods proposed in previous years:
• 2014: R-CNN - Girshick et al.
• 2015: Fast R-CNN - Girshick.
• 2016: Faster R-CNN
Ren, Girshick et al. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, CVPR 2016.


Overview
1. Pre-trained base network
2. Region Proposal Network (RPN)
3. Region of Interest (RoI) Pooling
4. Region-based CNN (R-CNN)

Faster R-CNN 1. Pre-trained base network

Pre-trained base network
Image of arbitrary size → feature map.
Common architectures:
• VGG (16, 19)
• ResNet (v1, v2)
• Inception (V2, V3)
Feature map encodes information for object detection.
[Figure: 800x600x3 image → CNN (VGG16) → 50x37x512 feature map]
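The figure's numbers (800x600 image, 50x37 feature map) can be reproduced with a small sketch. It assumes the base network is VGG16 truncated after its last convolutional block, which gives a total stride of 16, and that sizes are rounded down; `feature_map_size` is a hypothetical helper name, not part of any library.

```python
def feature_map_size(height, width, stride=16):
    """Spatial size of the feature map for a given input image.

    Assumption: total stride of 16 (VGG16 with four 2x2 poolings)
    and floor division at the borders.
    """
    return height // stride, width // stride


rows, cols = feature_map_size(600, 800)
print(rows, cols)  # 37 50
```

A different base network or truncation point would change the stride, and with it the size of everything downstream.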

Faster R-CNN 2. Region Proposal Network (RPN)

Region proposals
Image (feature map) → proposals:
• variable number.
• different scales and aspect ratios.
• efficient process
• project bounding boxes to original image.
Idea: start with reference boxes, later adjust.
How many reference boxes? A lot!

Anchor boxes
For each spatial position of the feature map, generate k fixed anchors (with same center).
3 scales, 3 aspect ratios (k=9)
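A rough sketch of anchor generation in plain Python. The base size of 256 pixels, the scale and ratio values, the stride of 16, and the choice of placing centers in the middle of each feature-map cell are all assumptions for illustration; the function names are hypothetical.

```python
import math


def reference_anchors(base_size=256, scales=(0.5, 1.0, 2.0), ratios=(0.5, 1.0, 2.0)):
    """k = len(scales) * len(ratios) boxes centered at (0, 0), as (x1, y1, x2, y2)."""
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:  # ratio = height / width, area stays constant
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors


def all_anchors(feat_h, feat_w, stride=16):
    """Shift the reference anchors to every feature-map position (image coordinates)."""
    ref = reference_anchors()
    boxes = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
            for x1, y1, x2, y2 in ref:
                boxes.append((cx + x1, cy + y1, cx + x2, cy + y2))
    return boxes


print(len(reference_anchors()))  # 9
print(len(all_anchors(37, 50)))  # 16650
```

For the 50x37 feature map this already yields 16650 reference boxes, which is the "a lot" the region proposals slide refers to.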

Visualizing anchor boxes (1)
[Figures: anchor centers in original image; anchors reference (9 anchors per position); anchors on top of single point]

Visualizing anchor boxes (2)
[Figures: all anchors superimposed; ground truth boxes, labels: person, bicycle]

Region Proposal Network (RPN)
Feature map → rectangular proposals + “objectness” score
RPN layers:
• 3x3 conv (pad 1, 512 output channels)
• 1x1 conv (2k output channels) → 2k objectness scores
• 1x1 conv (4k output channels) → 4k box regression scores
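To make the channel arithmetic concrete, here is a minimal sketch of the output shapes of the RPN layers for a given feature map. It only does bookkeeping, with no TensorFlow involved; `rpn_output_shapes` is a hypothetical helper, and the 37x50 feature map is carried over from the VGG16 example as an assumption.

```python
def rpn_output_shapes(feat_h, feat_w, k=9, hidden_channels=512):
    """Shapes (height, width, channels) produced by the RPN layers."""
    return {
        # 3x3 conv with pad 1 keeps the spatial size.
        "hidden": (feat_h, feat_w, hidden_channels),
        # Two scores (object / not object) per anchor.
        "objectness": (feat_h, feat_w, 2 * k),
        # Four box coordinates adjustments per anchor.
        "box_regression": (feat_h, feat_w, 4 * k),
        "num_anchors": feat_h * feat_w * k,
    }


shapes = rpn_output_shapes(37, 50)
print(shapes["objectness"])      # (37, 50, 18)
print(shapes["box_regression"])  # (37, 50, 36)
```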

RPN anchor targets
Need positive (foreground) vs negative (background) anchors.
Use Intersection over Union (IoU) with ground truth.
[Figures: all positive anchors (IoU > 0.7); anchors batch: positive (green), negative (red)]
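A minimal sketch of IoU-based anchor labelling. The positive threshold of 0.7 comes from the slide; the negative threshold of 0.3 and the "ignore" label for anchors in between are assumptions taken from the Faster R-CNN paper, and the paper's extra rule (the best anchor for each ground truth box is also positive) is left out for brevity.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (foreground), 0 (background) or -1 (ignored during training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1


ground_truth = [(0, 0, 100, 100)]
print(label_anchor((0, 0, 100, 90), ground_truth))       # 1
print(label_anchor((200, 200, 300, 300), ground_truth))  # 0
print(label_anchor((0, 0, 100, 50), ground_truth))       # -1
```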

What’s missing
Multi-task loss
• Classification: standard logarithmic loss for 2 classes.
• Box regression: Smooth L1 between difference of coordinates (positive anchors).
Filtering of proposals
• Use Non-Maximum Suppression (NMS).
• Keep only the top proposals by “objectness” score.
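Both pieces can be sketched in a few lines of plain Python. The greedy NMS below, its IoU threshold of 0.7, and the optional `top_n` cut-off are assumptions for illustration, as is the Smooth L1 switch point at 1.0; a TensorFlow implementation would normally rely on built-in ops instead.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thresh=0.7, top_n=None):
    """Greedy NMS: visit boxes by descending score, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
            if top_n is not None and len(keep) == top_n:
                break
    return keep


def smooth_l1(diff):
    """Quadratic near zero, linear for large errors (less sensitive to outliers)."""
    x = abs(diff)
    return 0.5 * x * x if x < 1.0 else x - 0.5


boxes = [(0, 0, 10, 10), (0, 1, 10, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))              # [0, 2]
print(smooth_l1(0.5), smooth_l1(2.0))  # 0.125 1.5
```

The second box overlaps the first with IoU above the threshold, so only the higher-scoring one survives.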


Faster R-CNN 3. Region of Interest (RoI) Pooling

RoI Pooling layer
Arbitrarily-sized proposals → fixed spatial size
• Can feed output to fully connected layers.
• Very similar to max pooling.
[Figure: proposal projected onto the 512-channel feature map → RoI → RoI Pool → 7x7x512]
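A minimal single-channel sketch of the idea: split the RoI into a 7x7 grid of bins and take the max of each bin. The bin boundaries (floor for the start, ceil for the end) and the RoI format (feature-map cell coordinates, end-exclusive) are assumptions; this is not necessarily how any particular library implements it.

```python
import math


def roi_max_pool(feature_map, roi, output_size=7):
    """Pool one channel of a RoI to output_size x output_size.

    feature_map: 2D list (rows x cols). roi: (x1, y1, x2, y2) in
    feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for i in range(output_size):
        r0 = y1 + (i * roi_h) // output_size
        r1 = y1 + math.ceil((i + 1) * roi_h / output_size)
        row = []
        for j in range(output_size):
            c0 = x1 + (j * roi_w) // output_size
            c1 = x1 + math.ceil((j + 1) * roi_w / output_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 37x50 feature map where each cell holds its own flat index.
fmap = [[r * 50 + c for c in range(50)] for r in range(37)]
pooled = roi_max_pool(fmap, roi=(10, 5, 31, 19))
print(len(pooled), len(pooled[0]))  # 7 7
print(pooled[0][0], pooled[6][6])   # 312 930
```

Whatever the proposal size, the output is 7x7 per channel, which is what allows feeding it to fully connected layers.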

Faster R-CNN 4. Region-based CNN (R-CNN)

Region-based CNN (R-CNN)
Fixed-size outputs of RoI Pooling (7x7x512) → Flatten → FC → FC →
• Softmax: probability distribution (N+1 classes), e.g. bicycle p=0.96
• bounding box regressions (N classes)
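A small sketch of how the two outputs could be combined for one proposal. It assumes that index 0 of the N+1 class scores is the background class and that there is one set of four box adjustments per foreground class; the function names and the example numbers are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_detection(class_logits, box_deltas, class_names):
    """Return (label, probability, box adjustments) for one proposal.

    class_logits has N + 1 entries (assumed: index 0 is background),
    box_deltas has N entries, one per foreground class.
    """
    probs = softmax(class_logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    if best == 0:
        return "background", probs[0], None
    return class_names[best - 1], probs[best], box_deltas[best - 1]


label, prob, deltas = pick_detection(
    class_logits=[0.5, 1.0, 4.5],
    box_deltas=[(0.0, 0.0, 0.0, 0.0), (0.1, -0.2, 0.05, 0.0)],
    class_names=["person", "bicycle"],
)
print(label, round(prob, 2), deltas)
```

Proposals classified as background are simply discarded, which is why no box adjustment is needed for that class.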

[Figure: detection result with person (0.99) and bicycle (0.97)]

Building a toolkit

What is Luminoth?
Open-source deep learning library/toolkit for computer vision object detection.
• CLI tools
• Pre-defined models
• Cloud integration

The goal
$ pip install luminoth
$ lumi train
# Magic

Objectives
• “Out-of-the-box” usage
• Production ready
• Open source
• Readable code
• Extensible and modular

TensorFlow + Sonnet

import sonnet as snt

class RPN(snt.AbstractModule):
    def __init__(self, *args, name='rpn'):
        # Sonnet modules must call the parent constructor with their name.
        super(RPN, self).__init__(name=name)
        [...]  # submodules init, config

    def _build(self, inputs):
        # TensorFlow code.
        return outputs

“Model oriented programming”
• Follow OOP good practices
Faster R-CNN modules:
• RPN
  in: → feature map, anchors
  out: → proposals
• RCNN
  in: → proposals, pooled feature maps
  out: → objects, labels, probabilities

Hierarchical structure
[Diagram of modules: ObjectDetection, FasterRCNN, RPN, RPNTargets, RPNProposals, RoIPooling, RCNN, RCNNTargets, RCNNProposals, TruncatedNetwork (VGG/ResNet), TFRecordDataset, ObjectDetectionDataset]

Challenges of coding from papers
• Small implementation details have no room in academic papers
• Papers tend to remain frozen in time
• Many ways to implement it

Challenges of Faster R-CNN implementation
• Multiple moving parts
• Module dependencies
• Multi-task training

Beyond the model
• Data pipeline
• Debugging
• Training
• Data visualization
• Evaluation
• Deployment
• Distributed
• Unit testing
• Monitoring

Using Luminoth
https://github.com/tryolabs/luminoth

$ pip install luminoth
$ lumi --help
Usage: lumi [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  cloud     Groups of commands to train models in the cloud
  dataset   Groups of commands to manage datasets
  evaluate  Evaluate trained (or training) models
  server    Groups of commands to serve models
  train     Train models

Luminoth cycle

$ lumi dataset transform --type pascal --data-dir /data/pascal --output /data/
# Create tfrecords for optimizing data consumption.
$ lumi train --config pascal-fasterrcnn.yml
# Hours of training
$ tensorboard --logdir jobs/
# On another GPU/Machine/CPU
$ lumi evaluate --config pascal-fasterrcnn.yml
# Checks for new checkpoints and writes logs
# Finally
$ lumi server web --config pascal-fasterrcnn.yml
# Looks for checkpoint and loads it into a simple frontend/json API server.

Luminoth’s future
• Fine-tune trained models
• More models & problems
• Tagging ↔ Training integration
• Distributed deployment

Thanks for listening! Questions?
Learn more & contribute: github.com/tryolabs/luminoth
