Building an Object Detection toolkit with TensorFlow (ODSC Europe 2017)

@ODSC OPEN DATA SCIENCE CONFERENCE, London | October 12th - 14th, 2017

Building an Object Detection toolkit with TensorFlow: From academic papers to open source implementation

Who we are (@tryolabs)
• Javier Rey, Lead Research Engineer (@vierja)
• Alan Descoins, CTO (@dekked_)

Introduction

Then vs now

• Felzenszwalb et al., “Object Detection with Discriminatively Trained Part Based Models”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 32, 2010.
• Detected objects in a sample image (from the COCO dataset) (2017). Source: Google Research Blog.
[Image labels: sofa, bottle, sofa]

Agenda
• Challenges and applications of object detection
• Demystifying it: dive into Faster R-CNN
• Luminoth: our open-source toolkit for computer vision

Challenges of object detection

Applications of object detection
• CT scan of a lung cancer patient at the Jingdong Zhongmei private hospital in Yanjiao, China's Hebei Province (AP Photo/Andy Wong)
• Hsieh et al., “Drone-based Object Counting by Spatially Regularized Regional Proposal Networks”, ICCV 2017.
• Source: Pinterest

A hard problem with lots of applications
Make it accessible! Build a toolkit!

Deep Learning & Object detection

Power of ConvNets as feature extractors
Pre-train:
[Figure: convolutional feature map. Figure from https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/]

Type of object detection models

Regression based methods
Single stage prediction of object classes and bounding boxes.
Examples:
• You Only Look Once (YOLO)
• Single Shot MultiBox Detector (SSD)

Region proposal based methods
Two stages:
1. Generate candidate locations using some algorithm.
2. Adjustment of bounding boxes and classification.
Examples:
• R-CNN, Fast R-CNN, Faster R-CNN

Faster R-CNN

Background
Evolution of methods proposed in previous years:
• 2014: R-CNN - Girshick et al.
• 2015: Fast R-CNN - Girshick.
• 2016: Faster R-CNN
Ren, Girshick et al., “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, CVPR 2016.


Overview
1. Pre-trained base network
2. Region Proposal Network (RPN)
3. Region of Interest (RoI) Pooling (RoIP)
4. Region-based CNN (R-CNN)

Faster R-CNN 1. Pre-trained base network

Pre-trained base network
Image of arbitrary size → feature map.
Common architectures:
• VGG (16, 19)
• ResNet (v1, v2)
• Inception (V2, V3)
Feature map encodes information for object detection.
[Diagram: 800 x 600 x 3 image → CNN (VGG16) → 50 x 37 x 512 feature map]
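The 800 x 600 image and the 50 x 37 feature map in the diagram are consistent with a base network that downsamples by a factor of 16. The sketch below only checks that arithmetic; the stride of 16 is an assumption (it is the usual value for a truncated VGG16, but the slides do not state it), and `feature_map_size` is a hypothetical helper, not part of any library.

```python
def feature_map_size(width, height, stride=16):
    """Spatial size of the feature map for a given total stride.

    stride=16 is an assumption for a truncated VGG16; other base
    networks or truncation points give other values.
    """
    return width // stride, height // stride


def to_image_coords(i, j, stride=16):
    """Center (in image pixels) of the feature map cell at column i, row j."""
    return i * stride + stride / 2, j * stride + stride / 2


if __name__ == "__main__":
    print(feature_map_size(800, 600))
    print(to_image_coords(0, 0))
```

The same stride is what lets boxes found on the feature map be projected back to the original image.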

Faster R-CNN 2. Region Proposal Network (RPN)

Region proposals
Image (feature map) → proposals:
• variable number.
• different scales and aspect ratios.
• efficient process.
• project bounding boxes to original image.
Idea: start with reference boxes, later adjust.
How many reference boxes? A lot!
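To make "start with reference boxes, later adjust" concrete, here is a minimal sketch of how four predicted numbers can adjust a reference box. It assumes the (tx, ty, tw, th) parameterization described in the Faster R-CNN paper (center shift relative to box size, log-scale change of width and height); `apply_deltas` is a hypothetical name used only for illustration.

```python
import math


def apply_deltas(anchor, deltas):
    """Adjust a reference box (x1, y1, x2, y2) with deltas (tx, ty, tw, th).

    tx, ty move the center by a fraction of the box width/height;
    tw, th rescale width/height on a log scale.
    """
    x1, y1, x2, y2 = anchor
    wa, ha = x2 - x1, y2 - y1
    xa, ya = x1 + wa / 2, y1 + ha / 2

    tx, ty, tw, th = deltas
    x, y = xa + tx * wa, ya + ty * ha
    w, h = wa * math.exp(tw), ha * math.exp(th)
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)


if __name__ == "__main__":
    # Zero deltas leave the reference box unchanged.
    print(apply_deltas((0, 0, 10, 20), (0, 0, 0, 0)))
    # tx = 0.5 shifts the box right by half its width.
    print(apply_deltas((0, 0, 10, 20), (0.5, 0, 0, 0)))
```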

Anchor boxes
For each spatial position of the feature map, generate k fixed anchors (with same center).
3 scales, 3 aspect ratios (k=9)
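A minimal sketch of anchor generation with 3 scales and 3 aspect ratios (k=9). The concrete scales (128, 256, 512), ratios (0.5, 1, 2) and the stride of 16 are assumptions taken from common Faster R-CNN defaults, not from the slides; `anchors_at` and `all_anchors` are hypothetical helper names.

```python
import math


def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return len(scales) * len(ratios) boxes (x1, y1, x2, y2) sharing one center.

    ratio is height / width; each box keeps an area of scale ** 2.
    """
    boxes = []
    for scale in scales:
        for ratio in ratios:
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


def all_anchors(feat_w, feat_h, stride=16):
    """Anchors for every spatial position of a feat_w x feat_h feature map."""
    out = []
    for j in range(feat_h):
        for i in range(feat_w):
            # Center of the feature map cell, in original image pixels.
            cx = i * stride + stride / 2
            cy = j * stride + stride / 2
            out.extend(anchors_at(cx, cy))
    return out


if __name__ == "__main__":
    print(len(anchors_at(8, 8)))
    print(len(all_anchors(50, 37)))
```

For a 50 x 37 feature map this gives 50 * 37 * 9 = 16650 reference boxes, which is why filtering is needed later.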

Visualizing anchor boxes (1)
• Anchor centers in original image
• Anchors reference (9 anchors per position)
• Anchors on top of single point

Visualizing anchor boxes (2)
• All anchors superimposed
• Ground truth boxes (labels: person, bicycle)

Region Proposal Network (RPN)
Feature map → rectangular proposals + “objectness” score
• 3x3 conv (pad 1, 512 output channels)
• 1x1 conv (2k output channels) → 2k objectness scores
• 1x1 conv (4k output channels) → 4k box regression scores
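The layer sizes above can be checked with simple shape arithmetic. This sketch assumes stride 1 for all three convolutions and uses the 37 x 50 feature map from the base network slide as input; `conv_out` and `rpn_output_shapes` are hypothetical helpers, not library functions.

```python
def conv_out(size, kernel, pad=0, stride=1):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def rpn_output_shapes(feat_h, feat_w, k=9):
    """Shapes (height, width, channels) of the two RPN outputs."""
    # 3x3 conv with pad 1 keeps the spatial size.
    h = conv_out(feat_h, kernel=3, pad=1)
    w = conv_out(feat_w, kernel=3, pad=1)
    # The 1x1 convs also keep the spatial size; only channels change.
    h, w = conv_out(h, kernel=1), conv_out(w, kernel=1)
    return {
        "objectness": (h, w, 2 * k),      # 2 scores per anchor
        "box_regression": (h, w, 4 * k),  # 4 values per anchor
    }


if __name__ == "__main__":
    print(rpn_output_shapes(37, 50))
```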

RPN anchor targets
Need positive (foreground) vs negative (background) anchors.
Use Intersection over Union (IoU) with ground truth.
[Figures: all positive anchors (IoU > 0.7); anchors batch: positive (green), negative (purple)]
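A minimal sketch of IoU and of labeling one anchor against the ground truth. The positive threshold of 0.7 comes from the slide; the negative threshold of 0.3 and the "ignore" label for anchors in between are assumptions based on the Faster R-CNN paper. The paper's extra rule (the best-matching anchor for each ground truth box is also positive) is left out to keep the sketch short. `label_anchor` is a hypothetical name.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchor(anchor, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (foreground), 0 (background) or -1 (ignored during training)."""
    best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
    if best > pos_thresh:
        return 1
    if best < neg_thresh:
        return 0
    return -1


if __name__ == "__main__":
    gt = [(0, 0, 10, 10)]
    print(label_anchor((0, 0, 10, 10), gt))          # perfect overlap
    print(label_anchor((100, 100, 110, 110), gt))    # no overlap
    print(label_anchor((5, 0, 15, 10), gt))          # partial overlap
```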

What’s missing

Multi-task loss
• Classification: standard logarithmic loss for 2 classes.
• Box regression: Smooth L1 between difference of coordinates (positive anchors).

Filtering of proposals
• Use Non-Maximum Suppression (NMS).
• Keep top in “objectness” only.
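The two ingredients named above, Smooth L1 and NMS, can be sketched in a few lines. This is a plain-Python illustration, not the toolkit's code: the greedy NMS variant, the default IoU threshold of 0.7 and the `top_n` parameter are assumptions, and a real implementation would use vectorized TensorFlow ops.

```python
def smooth_l1(x):
    """Quadratic near zero, linear elsewhere (less sensitive to outliers)."""
    ax = abs(x)
    return 0.5 * ax * ax if ax < 1.0 else ax - 0.5


def box_regression_loss(pred, target):
    """Sum of Smooth L1 over the four box coordinates of one positive anchor."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))


def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.7, top_n=None):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping a kept one.

    Returns indices of kept boxes, highest score first, at most top_n of them.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if top_n is not None and len(keep) == top_n:
                break
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores, iou_threshold=0.5))
```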


Faster R-CNN 3. Region of Interest (RoI) Pooling

RoI Pooling layer
Arbitrarily-sized proposals → fixed spatial size
• Can feed output to fully connected layers.
• Very similar to max pooling.
[Diagram: proposal → project onto feature map (512 channels) → RoI → RoI Pool → 7x7x512]
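A minimal sketch of the max-pooling flavor of RoI pooling on a single-channel feature map. It assumes the RoI is already projected to integer feature map coordinates and splits it into a fixed grid of bins (7x7 on the slide, 2x2 in the example for readability). `roi_max_pool` is a hypothetical helper; real implementations work on all channels at once and may use other strategies such as crop and resize.

```python
import math


def roi_max_pool(feature, roi, out_h=7, out_w=7):
    """Pool an arbitrarily sized RoI into a fixed out_h x out_w grid.

    feature: 2D list (rows of values) for one channel.
    roi: (x1, y1, x2, y2) in feature map cells, x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Bin boundaries; floor/ceil keeps every bin non-empty.
        ys = y1 + (i * h) // out_h
        ye = y1 + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            xs = x1 + (j * w) // out_w
            xe = x1 + math.ceil((j + 1) * w / out_w)
            row.append(max(feature[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    fmap = [[0, 1, 2, 3],
            [4, 5, 6, 7],
            [8, 9, 10, 11],
            [12, 13, 14, 15]]
    print(roi_max_pool(fmap, (0, 0, 4, 4), out_h=2, out_w=2))
```

Whatever the size of the RoI, the output grid has the same shape, which is what allows it to be flattened and fed to fully connected layers.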

Faster R-CNN 4. Region-based CNN (R-CNN)

Region-based CNN (R-CNN)
Fixed-size outputs of RoI Pooling →
• probability distribution (N+1 classes)
• bounding box regressions (N classes)
[Diagram: 7x7x512 → Flatten → FC → FC → Softmax → bicycle p=0.96]
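A minimal sketch of the classification output: a softmax over N+1 scores, where the extra class is background. Placing background at index 0 is an assumption made for this example, and `classify` is a hypothetical helper; the logits below are made-up numbers, not model outputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, class_names):
    """Pick the most likely of N+1 classes.

    class_names holds the N object classes; index 0 of logits is
    assumed to be the background class.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = "background" if best == 0 else class_names[best - 1]
    return label, probs[best]


if __name__ == "__main__":
    # Illustrative logits only: [background, person, bicycle]
    print(classify([0.0, 1.0, 5.0], ["person", "bicycle"]))
```

Proposals whose best class is background are discarded; for the rest, the box regression of the winning class refines the proposal.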

[Detection result: person (0.99), bicycle (0.97)]

Building a toolkit

What is Luminoth?
Open-source deep learning library/toolkit for computer vision object detection.
• CLI tools
• Pre-defined models
• Cloud integration

The goal
$ pip install luminoth
$ lumi train  # Magic

Objectives
• “Out-of-the-box” usage
• Production ready
• Open source
• Readable code
• Extensible and modular

TensorFlow + Sonnet

    import sonnet as snt

    class RPN(snt.AbstractModule):
        def __init__(self, *args, name='rpn'):
            [...]  # submodules init, config

        def _build(self, inputs):
            # TensorFlow code.
            return outputs

“Model oriented programming”
• Follow OOP good practices

Faster R-CNN
• RPN: in → feature map, anchors; out → proposals
• RCNN: in → proposals, pooled feature maps; out → objects, labels, probabilities

Hierarchical structure
[Diagram of modules: FasterRCNN, RPN, RPNTargets, RPNProposals, RoIPooling, RCNN, RCNNTargets, RCNNProposals, TruncatedNetwork (VGG/ResNet), TFRecordDataset, ObjectDetectionDataset]

Challenges of coding from papers
• Small implementation details have no room in academic papers
• Papers tend to remain frozen in time
• Many ways to implement it

Challenges of Faster R-CNN implementation
• Multiple moving parts
• Module dependencies
• Multi-task training

Beyond the model
[Diagram: Model, surrounded by: Data pipeline, Debugging, Training, Data visualization, Evaluation, Deployment, Distributed, Unit testing, Monitoring]

Using Luminoth
https://github.com/tryolabs/luminoth

$ pip install luminoth
$ lumi --help
Usage: lumi [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  cloud     Groups of commands to train models in the cloud
  dataset   Groups of commands to manage datasets
  evaluate  Evaluate trained (or training) models
  server    Groups of commands to serve models
  train     Train models

Luminoth cycle

$ lumi dataset transform --type pascal --data-dir /data/pascal --output /data/
# Create tfrecords for optimizing data consumption.

$ lumi train --config pascal-fasterrcnn.yml
# Hours of training

$ tensorboard --logdir jobs/

# On another GPU/Machine/CPU
$ lumi evaluate --config pascal-fasterrcnn.yml
# Checks for new checkpoints and writes logs

# Finally
$ lumi server web --config pascal-fasterrcnn.yml
# Looks for checkpoint and loads it into a simple frontend/json API server.

Luminoth’s future
• Fine-tune trained models
• More models & problems
• Tagging ↔ Training integration
• Distributed deployment

Thanks for listening! Questions?
Learn more & contribute: github.com/tryolabs/luminoth
