Streaming Data Analysis with Kubernetes

Dealing with real-time, in-memory streaming data is a unique challenge. With the advent of the smartphone and IoT (trillions of internet-connected devices), we are witnessing exponential growth in data.

To handle potential data growth, you want to process data in a cloud environment that can scale easily. Kubernetes offers container orchestration and auto-scaling capabilities that are well suited to streaming data use cases. Combined with Infinispan, an in-memory data grid from Red Hat, it gives you state-of-the-art distributed data processing capabilities to tackle these challenges.

In this session, we will identify critical patterns and principles that will help you achieve greater scale and response speed. On top of that, you will witness how Infinispan follows these patterns and principles to tackle a Big Data situation via a live coding demonstration on top of a container platform orchestrated by Kubernetes.

Galder Zamarreño

April 24, 2018

Transcript

  1. STREAMING DATA ANALYSIS WITH KUBERNETES
    Great Indian Developer Summit
    Galder Zamarreño Arrizabalaga
    @galderz
    26th April 2018

  2. @GALDERZ #INFINISPAN #GIDS18
    Since 2006
    ENGINEER
    @galderz
    Community Lead and
    Core Developer
    INFINISPAN
    CO-FOUNDER (2009) MUDKIP ROCKS!

  3. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM

  4. THE PROBLEM
    Delays can have a big impact
    EXPONENTIAL DATA GROWTH YEAR ON YEAR
    Smartphones, IoT devices, trillions of internet-connected devices...
    REAL-TIME STREAMING DATA PROCESSING IS CHALLENGING

  5. ZZZ... NO!

  6. THE PLATFORM
    Platform-as-a-Service (PaaS)
    Platform for developing and running applications
    Public or private and multi-language
    OpenShift is a Kubernetes distro with extras

  7. IS

  8. THE GLUE
    Vert.x is a toolkit for building reactive apps
    On JVM, event-driven and non-blocking
    RxJava integrates with Vert.x
    Great at event transform and coordination
    Works best with many sources of events (modern apps!)

  9. DATA GRID

  10. THE DATA
    http://transport.opendata.ch

  11. SAMPLE DATA - OPENDATA
    {"stop":{"station":{"id":"8500301","name":"Rheinfelden","score":null,
    "coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null},
    "arrival":null,"arrivalTimestamp":null,"departure":"2016-02-29T17:34:00+0100",
    "departureTimestamp":1456763640,"delay":3,"platform":"4",
    "prognosis":{"platform":"4","arrival":null,"departure":"2016-02-29T17:37:00+0100",
    "capacity1st":1,"capacity2nd":1},"realtimeAvailability":null,
    "location":{"id":"8500301","name":"Rheinfelden","score":null,
    "coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null}},
    "name":"IR 1978","category":"IR","categoryCode":2,"number":"1978","operator":"SBB",
    "to":"Basel SBB","capacity1st":null,"capacity2nd":null,"subcategory":"IR",
    "timeStamp":1456761753983,
    "nextStation":{"station":{"id":"8500301","name":"Rheinfelden","score":null,
    "coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null},
    "arrival":"2016-02-29T17:34:00+0100","arrivalTimestamp":1456763640,"departure":null,
    "departureTimestamp":null,"delay":null,"platform":"",
    "prognosis":{"platform":null,"arrival":null,"departure":null,"capacity1st":null,
    "capacity2nd":null},"realtimeAvailability":null,
    "location":{"id":"8500301","name":"Rheinfelden","score":null,
    "coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null}},
    "@version":"1","@timestamp":"2016-02-29T16:02:34.781Z"}
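
As a rough illustration of how a station board entry like this one could be screened for delays, the Python sketch below parses a trimmed copy of the record and reports the train if its `delay` field is positive. It assumes that `delay` is expressed in minutes (which is consistent with the 17:34 scheduled and 17:37 prognosis departures in the sample) and that a `null` delay means none was reported. The function names and the `min_delay` threshold are hypothetical and are not taken from the demo code.

```python
import json
from datetime import datetime

# Trimmed copy of the station board entry from the slide (unused fields dropped).
ENTRY = """
{"stop": {"station": {"id": "8500301", "name": "Rheinfelden"},
          "departure": "2016-02-29T17:34:00+0100",
          "delay": 3,
          "prognosis": {"departure": "2016-02-29T17:37:00+0100"}},
 "name": "IR 1978", "to": "Basel SBB"}
"""

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def delayed_train(raw, min_delay=1):
    """Return a small summary dict if the entry is delayed, else None.

    Assumption: "delay" is in minutes and null means "no delay reported".
    """
    entry = json.loads(raw)
    stop = entry["stop"]
    delay = stop.get("delay") or 0
    if delay < min_delay:
        return None
    return {
        "train": entry["name"],
        "to": entry["to"],
        "station": stop["station"]["name"],
        "delay_min": delay,
    }


def prognosis_delay_minutes(raw):
    """Cross-check: minutes between scheduled and prognosis departure."""
    stop = json.loads(raw)["stop"]
    planned = datetime.strptime(stop["departure"], TIME_FORMAT)
    expected = datetime.strptime(stop["prognosis"]["departure"], TIME_FORMAT)
    return int((expected - planned).total_seconds() // 60)


if __name__ == "__main__":
    print(delayed_train(ENTRY))
    print(prognosis_delay_minutes(ENTRY))
```

In a streaming setup, a filter of this kind would run on each incoming entry, so that only delayed trains are forwarded to downstream consumers.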

  12. SAMPLE DATA - SBB
    {"x":"8290840","y":"47483629","name":"IR 1978",
    "trainrefdate":"29.02.16","category":"IR","trainid":"84/25934/18/24/95","direction":"15",
    "prodclass":"4","delay":"7","passproc":"","lstopname":"Basel SBB","poly":
    [{"x":"8290840","y":"47483629","passproc":"","msec":"0","direction":"15"},
    {"x":"8290193","y":"47483647","passproc":"","msec":"2000","direction":"15"},
    {"x":"8289528","y":"47483674","passproc":"","msec":"4000","direction":"15"},
    {"x":"8288863","y":"47483701","passproc":"","msec":"6000","direction":"15"},
    {"x":"8288198","y":"47483728","passproc":"","msec":"8000","direction":"15"},
    {"x":"8287532","y":"47483755","passproc":"","msec":"10000","direction":"15"},
    {"x":"8286885","y":"47483773","passproc":"",...],"timeStamp":
    1456761728202,"@version":"1","@timestamp":"2016-02-29T16:02:11.321Z"}
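
The train position record encodes everything as strings, so a consumer has to convert values before using them. The Python sketch below shows one possible way to do that on a trimmed copy of the record. It rests on several assumptions that the slide does not confirm: that `x` and `y` are longitude and latitude in millionths of a degree, that `delay` is in minutes, and that `msec` is a time offset for each point in `poly`. The helper names are hypothetical.

```python
import json

# Trimmed copy of the train position record from the slide (two poly points kept).
RECORD = """
{"x": "8290840", "y": "47483629", "name": "IR 1978", "delay": "7",
 "lstopname": "Basel SBB",
 "poly": [{"x": "8290840", "y": "47483629", "msec": "0"},
          {"x": "8290193", "y": "47483647", "msec": "2000"}]}
"""


def to_degrees(value):
    """Assumption: coordinates are strings holding millionths of a degree."""
    return int(value) / 1_000_000


def parse_position(raw):
    """Convert the string-typed record into numbers a map or query can use."""
    record = json.loads(raw)
    delay = int(record["delay"]) if record.get("delay") else 0
    return {
        "train": record["name"],
        "last_stop": record["lstopname"],
        "delay_min": delay,  # assumption: minutes
        "lon": to_degrees(record["x"]),
        "lat": to_degrees(record["y"]),
        # (offset in ms, lon, lat) for each point of the projected track
        "track": [
            (int(p["msec"]), to_degrees(p["x"]), to_degrees(p["y"]))
            for p in record["poly"]
        ],
    }


if __name__ == "__main__":
    print(parse_position(RECORD))
```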

  13. HIGH LEVEL ARCHITECTURE

  14. COMPONENT ARCHITECTURE
    datagrid
    infinispan
    pod
    infinispan
    pod
    infinispan
    pod
    datagrid-hotrod
    service
    /eventbus/delayed-trains
    delayed trains
    /eventbus/delayed-positions
    app
    main http
    vert.x verticle pod
    station boards
    vert.x verticle
    train positions
    vert.x verticle
    delayed positions

  15. DEMO

  16. VERSATILITY OF INFINISPAN
    Distributed Cache, Shared Memory, Event Broker, Analytics

  17. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM

  18. THANK YOU!
    github.com/infinispan-demos/streaming-data-kubernetes
    infinispan.org
    redhat.com/en/technologies/jboss-middleware/data-grid
    openshift.com | vertx.io
