Building intelligent, real-time applications using Machine Learning

Jayesh
November 24, 2017

Discuss the current state of affairs for deploying machine learning models
Discuss the shortcomings of this approach
Discuss the value of streaming data
Brief introduction to Apache Kafka and streaming applications
Discuss how to serve ML models in real time with Apache Kafka
Demonstrate how we use a demography prediction model in real time

Transcript

  1. Scalable, real-time Machine Learning using Apache Kafka

  2. Agenda
    ● Traditional model deployment process
    ● 90 seconds to WoW
    ● Let’s process the incoming stream
    ● Demo
    ● What’s more?

  3. $ whoami
    ● Personalisation lead at Hotstar
    ● Led Data Infrastructure team at Grofers and TinyOwl
    ● Kafka fanboy
    ● Usually rant on twitter @jayeshsidhwani

  4. Machine Learning @ Hotstar
    ● ~150 mn users
    ● 4.8 mn peak concurrency
    ● 120K peak recommendation requests per second
    ● Diverse content in diverse languages

  5. Traditional model deployment process
    [Diagram labels: Model Training, Data Lake, Serialized Model, Batch Predictions, Recommendation APIs; split into Offline and Online]
    ● One-day / few-hours batch pre-compute
    ● Slow time to react

  6. Sense of urgency?
    ● 90 seconds to convert a new user
    ● To power their experience, we need to know the user’s gender, interests and more
    ● Need an always-thinking machine

  7. Thinking streams
    Data at rest
    ● Slow
    ● Batch-y
    Data in motion
    ● Fast
    ● Sub-second

  8. Enter Apache Kafka
    ● Kafka is a scalable, fault-tolerant, distributed message queue
    ● Producers and Consumers
    ● Uses
    ○ Real-time applications such as intelligent notifications, anomaly detection, etc.
    ○ Asynchronous communication in event-driven architectures
    Diagram credits: http://kafka.apache.org
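
The producer and consumer roles above can be pictured with a toy, in-memory log. This is only a sketch in plain Python, not the Kafka client API: the class `InMemoryTopic`, its methods and the group names are hypothetical, and it assumes a single partition with no persistence or networking. It shows the one property that matters here: producers append to a log, and each consumer group reads it at its own offset.

```python
from collections import defaultdict


class InMemoryTopic:
    """Toy stand-in for a single-partition topic (hypothetical, not the Kafka API)."""

    def __init__(self):
        self._log = []                    # append-only list of messages
        self._offsets = defaultdict(int)  # next offset to read, per consumer group

    def produce(self, message):
        """Append a message and return the offset it was stored at."""
        self._log.append(message)
        return len(self._log) - 1

    def consume(self, group, max_messages=10):
        """Return unread messages for `group` and advance that group's offset."""
        start = self._offsets[group]
        batch = self._log[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


topic = InMemoryTopic()
for event in ("page-click", "video-play", "page-click"):
    topic.produce(event)

# Two consumer groups read the same log independently of each other.
notifications = topic.consume("notifications")
anomalies = topic.consume("anomaly-detection", max_messages=2)
```

Because each group tracks its own offset, adding a new consumer does not affect existing ones, which is what allows several applications to share one stream.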

  9. Real-time infrastructure at Hotstar
    ● All clickstream data pushed into Apache Kafka
    ● Apache Kafka Streams to process events as they happen
    ● Incoming data available for everyone
    [Diagram labels: Apple TV, iOS, Android, Roku; Stream Processing Framework: Filter, Window, Join, Anomaly, Machine Learning; Intelligence]

  10. Demo
    Predict whether a flight is delayed in real-time

  11. How to process a stream?
    [Diagram; only the label "ML" survives]
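
One way to read this slide is that the model is applied to each event as it arrives, rather than in a nightly batch. The sketch below illustrates that idea in plain Python under loud assumptions: `predict_delayed` is a hypothetical rule standing in for a trained model, the field name `dep_delay_min` and the sample flights are made up, and a Python list stands in for the Kafka topic. It is not the code from the demo.

```python
def predict_delayed(flight):
    """Hypothetical stand-in for a trained model's predict() call."""
    # Assumption: a flight departing 15 or more minutes late is predicted delayed.
    return flight["dep_delay_min"] >= 15


def score_stream(events, predict):
    """Score each event as it arrives and yield the enriched record."""
    for event in events:
        yield {**event, "predicted_delayed": predict(event)}


# Made-up sample events; in a real pipeline these would come from a topic.
incoming = [
    {"flight": "AI101", "dep_delay_min": 3},
    {"flight": "6E202", "dep_delay_min": 42},
]
scored = list(score_stream(incoming, predict_delayed))
```

Since `score_stream` is a generator, each prediction is produced as soon as its event is read, so the same function would work unchanged over an unbounded source.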

  12. Advanced use-cases
    [Diagram: Hotstar Streaming Platform. Legend: processor nodes, source / sink nodes. Node labels: page-clicks, video-plays, predict-gender, predict-interest, 5-min trending videos, Recommended for You]
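
The "5-min trending videos" node suggests a windowed count over video-play events. A minimal sketch of that technique, assuming 5-minute tumbling (non-overlapping) windows keyed by event timestamp: the function name `trending_by_window`, the `(timestamp, video_id)` event shape and the sample data are hypothetical, and the slides do not say how Hotstar actually computes this.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # 5-minute tumbling windows


def trending_by_window(plays, top_n=2):
    """Group (timestamp_seconds, video_id) pairs into 5-minute windows and
    return the top_n most-played videos for each window start time."""
    counts = defaultdict(Counter)
    for ts, video_id in plays:
        window_start = ts - (ts % WINDOW_SECONDS)  # align to window boundary
        counts[window_start][video_id] += 1
    return {
        start: [video for video, _ in counter.most_common(top_n)]
        for start, counter in sorted(counts.items())
    }


# Made-up play events: (seconds since start, video id).
plays = [(10, "a"), (20, "b"), (200, "a"), (310, "c"), (320, "c"), (590, "b")]
trending = trending_by_window(plays)
```

This version processes a finite list; a streaming implementation would instead emit each window's result once the window closes and discard its counters.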

  13. Questions?
    tech.hotstar.com