“Not Hotdog” Revisited

In this session, the speaker demonstrates the power of the Elixir ecosystem by replicating the hallowed meme “Not Hotdog” with conventional technology. The session covers tasks commonly seen in MLOps and in application development in general.

Objectives:

1. To demonstrate the power of the Elixir ecosystem and the amount of engineering leverage available today.

2. To promote the art of writing useful, interoperable programs quickly based on the Unix philosophy.

3. To improve upon an ancient meme and bring joy to the community by creating shared experiences.

Scenarios:

- Perhaps one would like to integrate and run ML models after witnessing the power of Axon?

- Perhaps one would like to know how to architect such services end-to-end while maintaining good operational hygiene?

Prerequisites:

- Some prior experience in MLOps is advised, but it is not required to enjoy the session.

- A rudimentary understanding of how Deep Learning models work (both training and deployment) is advised and will make the talk more enjoyable, but it is not required.

The talk will utilise a pre-trained model; due to the amount of time required, training will not be done during the session.

Outline:

- Defining the problem
- Reviewing the scope of the problem
- Arranging the flow of control and data channels
- Designing overall system topology
- Integrating custom ML models in an Elixir system
- Selecting and loading the Model
- [If using Axon.Serving] Converting the Model for Axon.Serving using axon_onnx
- Testing the Model in batched operation
- Building a script to submit still frames
- Adapting the Model for interactive operation
- [If using Axon.Serving] Building the backend node with Axon.Serving
- [If not] Building a custom C Node for high-performance interoperation
- Building the video pipeline with Membrane
- Creating the overall Pipeline with Membrane
- Extracting and converting raw video frames
- Creating the backend for request submission
- Implementing a rate limiter with backend back-pressure
- Providing interactivity with Phoenix LiveView
- Creating custom JavaScript hooks & UI elements
- Integrating WebRTC feedback into the viewport
- Overlaying on-screen annotations with Phoenix LiveView
- Review / Compare & Contrast
- Testing the Solution
- Audience Q&A

As presented at ElixirConf EU 2023 on 21 April 2023

Evadne Wu

April 20, 2023

Transcript

  1. Me: Occasional Systems Engineer; Creator of Etso, Packmatic, Shun, etc.; Spawnfest Participant & Judge; Replicator of Memes
  2. 👁 “Not Hotdog” App
    https://www.theverge.com/tldr/2017/5/14/15639784/hbo-silicon-valley-not-hotdog-app-download
    👁 “Not Hotdog” - Google Play
    https://play.google.com/store/apps/details?id=com.codylab.seefood&hl=en&gl=US
  6. [Flowchart: camera acquisition] Nodes: Start, Enumerate Devices (shown three times), GetUserMedia. Branch labels: Tight Constraint, Loose Constraint, OK, Fail. 🎥? appears at four points.
  7. Tight Constraint
    {
      video: {
        width: { max: 1280, ideal: 1280, min: 320 },
        height: { max: 720, ideal: 720, min: 320 },
        frameRate: { max: 30, ideal: 24 },
        deviceId: { exact: videoDevice.deviceId },
        facingMode: { exact: 'environment' }
      }
    }
  8. Loose Constraint
    {
      video: {
        width: { max: 1280, ideal: 1280, min: 320 },
        height: { max: 720, ideal: 720, min: 320 },
        frameRate: { max: 30, ideal: 24 },
        deviceId: { exact: videoDevice.deviceId },
      }
    }
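The tight-then-loose acquisition shown on the slides above could be sketched roughly as follows. This is a minimal illustration, not the speaker's code: the function names `tightConstraint`, `looseConstraint` and `acquireStream` are hypothetical, and falling back on any failure (rather than only on an over-constrained error) is an assumption. The `mediaDevices` argument stands in for `navigator.mediaDevices` so the logic can also run outside a browser.

```javascript
// Builds the "tight" constraint from the slides for a given device id.
function tightConstraint(deviceId) {
  return {
    video: {
      width: { max: 1280, ideal: 1280, min: 320 },
      height: { max: 720, ideal: 720, min: 320 },
      frameRate: { max: 30, ideal: 24 },
      deviceId: { exact: deviceId },
      facingMode: { exact: "environment" },
    },
  };
}

// The "loose" constraint is the same object without the facingMode requirement.
function looseConstraint(deviceId) {
  const { facingMode, ...video } = tightConstraint(deviceId).video;
  return { video };
}

// Tries the tight constraint first and falls back to the loose one on failure.
// Assumption: any rejection of the first request triggers the fallback.
async function acquireStream(mediaDevices, deviceId) {
  try {
    return await mediaDevices.getUserMedia(tightConstraint(deviceId));
  } catch (error) {
    return await mediaDevices.getUserMedia(looseConstraint(deviceId));
  }
}
```

In a page this would be called as `acquireStream(navigator.mediaDevices, videoDevice.deviceId)`. Deriving the loose constraint from the tight one keeps the two from drifting apart.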
  9. [Sequence diagram: session start-up, steps numbered 1 to 6] Participants: Browser, WebRTC Hook (JS), WebRTC Client (JS), Session LiveView, Inference Session. Step labels: 🎥 Stream Acquisition, Stream Acquired, WebRTC Hook Start, Inference Session Start, WebRTC Endpoint Start, Interactive Connectivity Establishment.
  10. [Pipeline diagram: Inference Endpoint] Elements: Track Receiver, Depayloader, H.264 Parser, H.264 Decoder, Orientation Tracker, Inference Sink. Data labels: RTP Packet, RTP Packet (Ordered), RTP Packet, H.264 Access Unit (Frame), %Membrane.RawVideo{}, Orientation Data.
  12. Coordination of Video Orientation (CVO), 2-bit granularity
    3GPP TS 26.114 V13.3.0 (2016-03)
    Bit:    #7  #6  #5  #4  #3  #2  #1  #0
    Field:                  C   F   R1  R0
  13. R1  R0  Rotation as Sent  Rotation to Display
      0   0   0°                0°
      0   1   270° CW           90° CW
      1   0   180° CW           180° CW
      1   1   90° CW            270° CW
      👌
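To make the table concrete, a decoder for the CVO byte might look like the sketch below. It assumes the layout implied by the slides, with C in bit 3, F in bit 2 and R1 R0 in the two lowest bits; the function name `decodeCvo` and the shape of the returned object are hypothetical. The C and F bits are returned raw rather than interpreted.

```javascript
// Rotation (degrees clockwise) to apply for display, indexed by the
// two-bit value R1 R0, taken from the table on the slide above.
const ROTATION_TO_DISPLAY_CW = [0, 90, 180, 270];

// Decodes one CVO byte with 2-bit rotation granularity.
// Assumed layout: bit 3 = C, bit 2 = F, bits 1..0 = R1 R0.
function decodeCvo(byte) {
  const r = byte & 0b11;
  const toDisplay = ROTATION_TO_DISPLAY_CW[r];
  return {
    camera: (byte >> 3) & 1,
    flip: (byte >> 2) & 1,
    // "As sent" is the inverse of the rotation needed for display.
    rotationAsSentCw: (360 - toDisplay) % 360,
    rotationToDisplayCw: toDisplay,
  };
}
```

For example, `decodeCvo(0b0001)` yields a display rotation of 90° clockwise, matching the second row of the table.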
  14. [Pipeline diagram: Inference Endpoint] Elements: Track Receiver, Depayloader, H.264 Parser, H.264 Decoder, Orientation Tracker, Inference Sink. Data labels: RTP Packet, RTP Packet (Ordered), RTP Packet, H.264 Access Unit (Frame), %Membrane.RawVideo{}, Orientation Data.
  15. [Diagram: inference request flow, steps numbered 1 to 7] Components: 🌭 Inference Daemon, Inference Broker, Inference Requestor, Inference Sink, 📐. Messages: New Frame From Stream, New Format From Stream, Update, Reply, Ask, Request, Reply, Reply (Go).
  17. Inference Requestor: If there is a pending frame, the requestor asks the Broker for another inference run once the current one completes.
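The requestor behaviour described above can be illustrated with a small state machine. This is a sketch under assumptions, not the talk's implementation: the class name `InferenceRequestor`, the callback-style `askBroker(frame, reply)` function, and the choice to keep only the most recent pending frame are all hypothetical.

```javascript
// Runs at most one inference at a time. Frames that arrive while a run is
// in flight replace any older pending frame (assumption: latest frame wins).
class InferenceRequestor {
  constructor(askBroker) {
    this.askBroker = askBroker; // (frame, reply) => void
    this.busy = false;
    this.pendingFrame = null;
    this.lastResult = null;
  }

  // Called for every new frame from the stream.
  newFrame(frame) {
    if (this.busy) {
      this.pendingFrame = frame;
    } else {
      this.run(frame);
    }
  }

  run(frame) {
    this.busy = true;
    this.askBroker(frame, (result) => this.onReply(result));
  }

  // Called when the current run completes; starts the next run if a frame is pending.
  onReply(result) {
    this.busy = false;
    this.lastResult = result;
    if (this.pendingFrame !== null) {
      const next = this.pendingFrame;
      this.pendingFrame = null;
      this.run(next);
    }
  }
}
```

Because new work is only requested after a reply arrives, the requestor never has more than one request outstanding, whatever the camera frame rate is.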
  18. [Diagram: inference request flow, steps numbered 1 to 7] Components: 🌭 Inference Daemon, Inference Broker, Inference Requestor, Inference Sink, 📐. Messages: New Frame From Stream, New Format From Stream, Update, Reply, Ask, Request, Reply, Reply (Go).
  19. [Diagram: sbroker between several Inference Requestors and several Inference Daemons] Queue labels: ask (codel), ask_r (drop/∞).
  20. [Diagram: Inference Daemon, sbroker, Inference Requestor; queue labels (codel) and ask_r (drop/∞)] Overload: Too many Requestors. Requestors dropped fairly via CoDel.
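For readers unfamiliar with CoDel, the idea is to drop queued requests only once their waiting time has stayed above a target for a full interval, and then to drop progressively faster while the queue remains congested. The sketch below is a simplified version modelled on the published CoDel pseudocode; it is not sbroker's implementation, and the class name `CoDelQueue`, its methods, and the default `target` and `interval` values (in milliseconds) are assumptions for illustration.

```javascript
class CoDelQueue {
  constructor({ target = 5, interval = 100 } = {}) {
    this.target = target;       // acceptable waiting time
    this.interval = interval;   // how long the wait may exceed target before dropping
    this.items = [];
    this.firstAboveTime = null; // when dropping becomes allowed, or null
    this.dropping = false;
    this.dropNext = 0;
    this.count = 0;
    this.lastCount = 0;
  }

  enqueue(item, now) {
    this.items.push({ item, enqueuedAt: now });
  }

  // Pops the head and reports whether it has waited too long for long enough.
  pop(now) {
    const entry = this.items.length > 0 ? this.items.shift() : null;
    if (entry === null || now - entry.enqueuedAt < this.target) {
      this.firstAboveTime = null;
      return { entry, okToDrop: false };
    }
    if (this.firstAboveTime === null) {
      this.firstAboveTime = now + this.interval;
      return { entry, okToDrop: false };
    }
    return { entry, okToDrop: now >= this.firstAboveTime };
  }

  // Next drop time shrinks with the square root of the drop count.
  controlLaw(t) {
    return t + this.interval / Math.sqrt(this.count);
  }

  // Returns the next item to serve plus any items dropped on the way.
  dequeue(now) {
    const dropped = [];
    let r = this.pop(now);
    if (this.dropping) {
      if (!r.okToDrop) this.dropping = false;
      while (this.dropping && now >= this.dropNext) {
        dropped.push(r.entry.item);
        this.count += 1;
        r = this.pop(now);
        if (!r.okToDrop) this.dropping = false;
        else this.dropNext = this.controlLaw(this.dropNext);
      }
    } else if (r.okToDrop) {
      dropped.push(r.entry.item);
      r = this.pop(now);
      this.dropping = true;
      const delta = this.count - this.lastCount;
      this.count = delta > 1 && now - this.dropNext < 16 * this.interval ? delta : 1;
      this.dropNext = this.controlLaw(now);
      this.lastCount = this.count;
    }
    return { item: r.entry ? r.entry.item : null, dropped };
  }
}
```

Unlike a fixed queue-length limit, this approach tolerates short bursts and only sheds load when waiting times stay high, which is what makes the dropping fair across requestors.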
  21. [Diagram: inference request flow, steps numbered 1 to 4] Components: 🌭 Inference Daemon, Inference Broker, Inference Requestor, Inference Sink, 📐. Messages: New Frame From Stream, New Format From Stream, Update, Reply, Ask.
  23. YOLOv5 (v6.0/6.1) consists of:
    Backbone: New CSP-Darknet53
    Neck: SPPF, New CSP-PAN
    Head: YOLOv3 Head
    📖 YOLOv5 Network Architecture
    https://github.com/ultralytics/yolov5/issues/6998
    📖 YOLO v5 model architecture [Explained]
    https://iq.opengenus.org/yolov5/
  24. person, fire hydrant, elephant, skis, wine glass, broccoli, dining table, toaster,
    bicycle, stop sign, bear, snowboard, cup, carrot, toilet, sink,
    car, parking meter, zebra, sports ball, fork, hot dog, tv, refrigerator,
    motorcycle, bench, giraffe, kite, knife, pizza, laptop, book,
    airplane, bird, backpack, baseball bat, spoon, donut, mouse, clock,
    bus, cat, umbrella, baseball glove, bowl, cake, remote, vase,
    train, dog, handbag, skateboard, banana, chair, keyboard, scissors,
    truck, horse, tie, surfboard, apple, couch, cell phone, teddy bear,
    boat, sheep, suitcase, tennis racket, sandwich, potted plant, microwave, hair drier,
    traffic light, cow, frisbee, bottle, orange, bed, oven, toothbrush
  25. person, fire hydrant, elephant, skis, wine glass, broccoli, dining table, toaster,
    bicycle, stop sign, bear, snowboard, cup, carrot, toilet, sink,
    car, parking meter, zebra, sports ball, fork, hot dog, tv, refrigerator,
    motorcycle, bench, giraffe, kite, knife, pizza, laptop, book,
    airplane, bird, backpack, baseball bat, spoon, donut, mouse, clock,
    bus, cat, umbrella, baseball glove, bowl, cake, remote, vase,
    train, dog, handbag, skateboard, banana, chair, keyboard, scissors,
    truck, horse, tie, surfboard, apple, couch, cell phone, teddy bear,
    boat, sheep, suitcase, tennis racket, sandwich, potted plant, microwave, hair drier,
    traffic light, cow, frisbee, bottle, orange, bed, oven, toothbrush
    👌
  26. @type detection :: {
      width :: non_neg_integer(),
      height :: non_neg_integer(),
      class_name :: String.t(),
      class_id :: non_neg_integer(),
      score :: float()
    }
  28. [Diagram: Session LiveView markup] Elements: <video>, <svg>, <svg>, <div>, <rect>, <text>. Annotations:
    - Provides the End User with visual feedback once the detection process has started
    - Provides positioning of overlay with the viewbox sized to the video as projected on the browser’s viewport
    - Provides positioning of detections within the 640 by 640 results grid
    - Shows the message “Hotdog” or “Not Hotdog” based on computed detections
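The last annotation, choosing the message from the computed detections, amounts to a single predicate. The sketch below assumes detections arrive in the browser as objects with `className` and `score` fields (mirroring the `class_name` and `score` elements of the detection type shown earlier) and that the class label is `hot dog` as in the class list; the function name `verdict` and the 0.5 score threshold are hypothetical.

```javascript
// Returns the on-screen message for a list of detections.
// Assumption: a single sufficiently confident "hot dog" detection is enough.
function verdict(detections, minScore = 0.5) {
  const found = detections.some(
    (d) => d.className === "hot dog" && d.score >= minScore
  );
  return found ? "Hotdog" : "Not Hotdog";
}
```

An empty detection list yields “Not Hotdog”, so the overlay has a sensible message before the first inference result arrives.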
  29. Bumblebee.Vision
    @type image() :: Nx.Container.t()
    > Nx.Tensor or struct implementing Nx.Container and resolving to a tensor
    > HWC order, RGB, optional Alpha
    > Integer type (:s or :u)
  30. As presented at ElixirConf EU 2023: “Not Hotdog” Revisited
    Special Thanks to:
    Cocoa Xu / cocoa-xu (Evision)
    Mateusz Front / mat-hek and Michał Śledź / mickel8 (Membrane)
    Presented By:
    Evadne Wu / evadne