Machine Learning.
• Familiarity with NumPy and Jupyter Notebooks.
Recommended
• Familiarity with TensorFlow.
Helpful to have
• Basics of Deep Learning and Convolutional Neural Networks (CNN).
use as input to a simple classification algorithm.
Deep Learning models
Use the images directly as input to a more complex classification algorithm.
DATASET
an activation map.
Source: https://github.com/vdumoulin/conv_arithmetic
Use more filters to detect patterns over activation maps (patterns over patterns over patterns…)
→ We learn them. They are regular weights of the network, trained with backpropagation.
How do we know how many filters to use in each layer?
→ It is a hyperparameter of the network (try and see what works best).
Source: https://cs231n.github.io/understanding-cnn/
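To make the idea of a filter producing an activation map concrete, here is a minimal NumPy sketch. The toy image, the hand-picked edge filter, and the helper name `conv2d_valid` are illustrative assumptions, not part of the original slides; in a real CNN the filter values would be learned rather than written by hand.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the activation map.

    Like most deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    activation = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            activation[i, j] = np.sum(patch * kernel)
    return activation

# Toy 5x5 image: dark on the left, bright on the right (hypothetical data).
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=np.float64)
# Hand-picked vertical-edge filter; a network would learn these weights.
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float64)

activation_map = conv2d_valid(image, kernel)
print(activation_map)  # large values where the filter overlaps the edge
```

The activation is high where the patch matches the pattern encoded by the filter and zero over the uniform region, which is what lets later layers detect patterns over these maps.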
the Jingdong Zhongmei private hospital in Yanjiao, China's Hebei Province (AP Photo/Andy Wong)
Hsieh et al., “Drone-based Object Counting by Spatially Regularized Regional Proposal Network”, ICCV 2017.
Source: Pinterest
stage prediction of object classes and bounding boxes.
Examples:
• You Only Look Once (YOLO, YOLOv2, YOLOv3)
• Single Shot MultiBox Detector (SSD)
Two stages:
1. Generate candidate locations using some algorithm.
2. Adjust the bounding boxes and classify them.
Examples:
• R-CNN, Fast R-CNN, Faster R-CNN
R-CNN - Girshick et al. 2014
Fast R-CNN - Girshick. 2015
Faster R-CNN
Ren, He, Girshick and Sun. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, NIPS 2015.
• Use it to propose interesting regions worth exploring. Associate an objectness score with each of them.
• Classify regions. Discard those that are background (i.e. keep good scores only). Learn how to further adjust the boxes for each class of object.
and its vicinity.
2. Predict 2 points (x1, y1), (x2, y2) for each location.
Issues:
• Can we make the network predict exact pixel coordinates?
• Image dimensions are variable.
2. Define a fixed-size reference box (called an anchor).
3. Find the “closest” ground-truth (GT) box.
4. Predict the “objectness” of the region.
5. Learn how to modify the anchor (in relative terms, e.g. “double its width”).
6. Repeat for every spatial position.
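The "modify the anchor in relative terms" step can be sketched as follows. This assumes the center/size parametrization described in the Faster R-CNN paper (offsets scaled by the anchor size, log-scaled width and height); the function names `to_center_size`, `encode` and `decode` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def to_center_size(boxes):
    """Convert (x1, y1, x2, y2) boxes to center x, center y, width, height."""
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    return boxes[:, 0] + 0.5 * w, boxes[:, 1] + 0.5 * h, w, h

def encode(anchors, gt_boxes):
    """Relative deltas (tx, ty, tw, th) that turn each anchor into its GT box."""
    ax, ay, aw, ah = to_center_size(anchors)
    gx, gy, gw, gh = to_center_size(gt_boxes)
    return np.stack(
        [(gx - ax) / aw, (gy - ay) / ah, np.log(gw / aw), np.log(gh / ah)],
        axis=1,
    )

def decode(anchors, deltas):
    """Apply predicted deltas to anchors and return (x1, y1, x2, y2) boxes."""
    ax, ay, aw, ah = to_center_size(anchors)
    cx = ax + deltas[:, 0] * aw
    cy = ay + deltas[:, 1] * ah
    w = aw * np.exp(deltas[:, 2])
    h = ah * np.exp(deltas[:, 3])
    return np.stack(
        [cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h], axis=1
    )

# Hypothetical example: the GT box has the same center as the anchor
# but twice its width, so only the width delta is non-zero (log 2).
anchors = np.array([[0.0, 0.0, 10.0, 10.0]])
gt_boxes = np.array([[-5.0, 0.0, 15.0, 10.0]])

deltas = encode(anchors, gt_boxes)
recovered = decode(anchors, deltas)
print(deltas)
print(recovered)
```

Because the targets are relative to the anchor, the network never has to output absolute pixel coordinates, which sidesteps the issue of variable image dimensions.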
negative (red)
Faster R-CNN
Need positive (foreground) vs negative (background) anchors.
Use Intersection over Union (IoU) with ground truth to decide.
Faster R-CNN: region proposal
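Labelling anchors as positive or negative with IoU might look like the sketch below. The thresholds (0.7 for positive, 0.3 for negative) are the defaults from the Faster R-CNN paper and are an assumption here, since the slides do not state them; the helper names and the `-1` "ignored" label are likewise illustrative.

```python
import numpy as np

def iou_matrix(anchors, gt_boxes):
    """Pairwise IoU between (N, 4) anchors and (M, 4) GT boxes in (x1, y1, x2, y2)."""
    x1 = np.maximum(anchors[:, None, 0], gt_boxes[None, :, 0])
    y1 = np.maximum(anchors[:, None, 1], gt_boxes[None, :, 1])
    x2 = np.minimum(anchors[:, None, 2], gt_boxes[None, :, 2])
    y2 = np.minimum(anchors[:, None, 3], gt_boxes[None, :, 3])
    # Clip at zero so boxes that do not overlap get an empty intersection.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    union = area_a[:, None] + area_g[None, :] - inter
    return inter / union

def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (foreground), 0 (background) or -1 (ignored) for each anchor."""
    best_iou = iou_matrix(anchors, gt_boxes).max(axis=1)
    labels = np.full(len(anchors), -1, dtype=np.int64)
    labels[best_iou < neg_thresh] = 0
    labels[best_iou >= pos_thresh] = 1
    return labels

# Hypothetical anchors: a perfect match, a partial overlap, and no overlap.
anchors = np.array([
    [0.0, 0.0, 10.0, 10.0],
    [5.0, 0.0, 15.0, 10.0],
    [20.0, 20.0, 30.0, 30.0],
])
gt_boxes = np.array([[0.0, 0.0, 10.0, 10.0]])

labels = label_anchors(anchors, gt_boxes)
print(labels)
```

Anchors whose best IoU falls between the two thresholds are neither clearly foreground nor background, so this sketch leaves them out of training by marking them as ignored.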
Use Non-Maximum Suppression (NMS).
• Keep only the top proposals by “objectness”.
Classification
Standard cross-entropy for 2 classes.
Box regression
Smooth L1 on the difference between predicted and target coordinates (positive anchors only).
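The box regression loss can be sketched in NumPy as below. The Smooth L1 definition (quadratic below 1, linear above) is the standard one; averaging over positive anchors is a simplifying assumption of this sketch, and the function name `rpn_box_loss` and the sample values are hypothetical.

```python
import numpy as np

def smooth_l1(diff):
    """Elementwise Smooth L1: 0.5 * x^2 if |x| < 1, otherwise |x| - 0.5."""
    abs_diff = np.abs(diff)
    return np.where(abs_diff < 1.0, 0.5 * diff ** 2, abs_diff - 0.5)

def rpn_box_loss(pred_deltas, target_deltas, labels):
    """Smooth L1 over the 4 box deltas, computed on positive anchors only.

    Assumption: the loss is averaged over positive anchors; real
    implementations may normalize differently.
    """
    positive = labels == 1
    if not np.any(positive):
        return 0.0
    per_anchor = smooth_l1(pred_deltas[positive] - target_deltas[positive]).sum(axis=1)
    return float(per_anchor.mean())

# Hypothetical predictions for three anchors; the last one is background,
# so its (large) error must not contribute to the loss.
pred_deltas = np.array([
    [0.5, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
])
target_deltas = np.zeros((3, 4))
labels = np.array([1, 1, 0])

loss = rpn_box_loss(pred_deltas, target_deltas, labels)
print(loss)
```

Smooth L1 behaves like L2 for small errors and like L1 for large ones, which keeps gradients bounded when a prediction is far from its target.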
to get feature map.
2. Run feature map through RPN convolutional layers (3x3, 1x1 & 1x1)
   a. Obtain objectness and box regression scores for each anchor type and spatial position.
   b. Use regression scores to adjust each anchor.
3. Sort proposals by objectness score.
4. Apply NMS to remove redundant proposals.
Result
Set of proposals with associated objectness scores
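Steps 3 and 4 above (sort by objectness, then apply NMS) can be sketched as a greedy loop in NumPy. The IoU threshold of 0.7 and the optional `top_n` cap are assumptions for illustration, not values given in the slides, and this loop is a readable reference rather than an optimized implementation.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.7, top_n=None):
    """Greedy Non-Maximum Suppression; returns indices of the kept boxes.

    boxes: (N, 4) array in (x1, y1, x2, y2); scores: (N,) objectness scores.
    """
    order = np.argsort(-scores)  # highest objectness first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        if top_n is not None and len(keep) == top_n:
            break
        rest = order[1:]
        # IoU between the best box and every remaining box.
        x1 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_best + area_rest - inter)
        # Drop boxes that overlap too much with the one we just kept.
        order = rest[iou <= iou_threshold]
    return keep

# Hypothetical proposals: the first two overlap heavily, the third is separate.
boxes = np.array([
    [0.0, 0.0, 10.0, 10.0],
    [1.0, 0.0, 11.0, 10.0],
    [20.0, 20.0, 30.0, 30.0],
])
scores = np.array([0.9, 0.8, 0.7])

keep = nms(boxes, scores)
print(keep)
```

The second box is suppressed because it overlaps the higher-scoring first box by more than the threshold, while the third survives because it covers a different region.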
crucial implementation details, such as shapes and types.
• Comments have hints to help you.
  ◦ We can help you too, don’t be shy and ask! :D
Priorities:
1. Make it work (whatever it takes!).
2. Implement it with vectorized NumPy.
3. Implement it in pure TensorFlow.
   a. It can then be compiled and run on a GPU.
   b. You would have to do this for a real implementation.
Found 1 files to predict.
Neither checkpoint not config specified, assuming `accurate`.
Predicting video.mp4 [#############] 100% fps: 5.9
Building a toolkit
--help
Usage: lumi [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help  Show this message and exit.

Commands:
  checkpoint  Groups of commands to manage checkpoints
  cloud       Groups of commands to train models in the cloud
  dataset     Groups of commands to manage datasets
  eval        Evaluate trained (or training) models
  predict     Obtain a model's predictions
  server      Groups of commands to serve models
  train       Train models
# Create tfrecords for optimizing data consumption.
$ lumi train --config pascal-fasterrcnn.yml
# Hours of training...
$ tensorboard --logdir jobs/
# On another GPU/Machine/CPU.
$ lumi eval --config pascal-fasterrcnn.yml
# Checks for new checkpoints and writes logs.
# Finally.
$ lumi server web --config pascal-fasterrcnn.yml
# Looks for checkpoint and loads it into a simple frontend/json API server.