Streaming Data Analysis with Kubernetes

Dealing with real-time, in-memory streaming data is a unique challenge, and with the advent of smartphones and the IoT (trillions of internet-connected devices) we are witnessing exponential growth in data. To handle that growth, you want to process data in a cloud environment that can scale easily. In this space, Kubernetes offers container orchestration and auto-scaling, and when combined with Infinispan, an in-memory data grid, it gives you state-of-the-art distributed data processing capabilities to tackle these challenges. In this session, we will identify critical patterns and principles that help you achieve greater scale and response speed, and you will see them in action with live coding examples.

Galder Zamarreño

February 06, 2018

Transcript

  1. STREAMING DATA ANALYSIS WITH KUBERNETES
    JFokus
    Galder Zamarreño Arrizabalaga

    @galderz

    6th February 2018

  2. @GALDERZ #INFINISPAN #JFOKUS
    Since 2006
    ENGINEER
    @galderz
    Community Lead and Core Developer
    INFINISPAN
    CO-FOUNDER (2009) MUDKIP ROCKS!

  3. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM

  4. THE PROBLEM
    EXPONENTIAL DATA GROWTH YEAR ON YEAR
    Smartphones, IoT devices, trillions of internet-connected devices...
    REAL-TIME STREAMING DATA PROCESSING IS CHALLENGING
    Delays can have a big impact

  5. ZZZ... NO!

  6. THE PLATFORM
    Platform-as-a-Service (PaaS)
    Platform for developing and running applications
    Public or private, and multi-language
    OpenShift is a Kubernetes distribution with extras

  7. IS

  8. THE CLOUD
    Timber!
    GENERAL CONTEXT
    Public-only platform for running, managing and scaling applications in the cloud
    DEMO CONTEXT
    Provisions and manages instances where OpenShift will run

  9. THE GLUE
    Vert.x is a toolkit for building reactive apps
    Runs on the JVM, event-driven and non-blocking
    RxJava integrates with Vert.x
    Great at event transformation and coordination
    Works best with many sources of events (modern apps!)

  10. DATA GRID

  11. THE DATA
    http://transport.opendata.ch

  12. SAMPLE DATA
    {"stop": {
       "station": {"id": "8500301", "name": "Rheinfelden", "score": null,
         "coordinate": {"type": "WGS84", "x": 47.55121, "y": 7.792155}, "distance": null},
       "arrival": null, "arrivalTimestamp": null,
       "departure": "2016-02-29T17:34:00+0100", "departureTimestamp": 1456763640,
       "delay": 3, "platform": "4",
       "prognosis": {"platform": "4", "arrival": null, "departure": "2016-02-29T17:37:00+0100",
         "capacity1st": 1, "capacity2nd": 1},
       "realtimeAvailability": null,
       "location": {"id": "8500301", "name": "Rheinfelden", "score": null,
         "coordinate": {"type": "WGS84", "x": 47.55121, "y": 7.792155}, "distance": null}},
     "name": "IR 1978", "category": "IR", "categoryCode": 2, "number": "1978",
     "operator": "SBB", "to": "Basel SBB", "capacity1st": null, "capacity2nd": null,
     "subcategory": "IR", "timeStamp": 1456761753983,
     "nextStation": {
       "station": {"id": "8500301", "name": "Rheinfelden", "score": null,
         "coordinate": {"type": "WGS84", "x": 47.55121, "y": 7.792155}, "distance": null},
       "arrival": "2016-02-29T17:34:00+0100", "arrivalTimestamp": 1456763640,
       "departure": null, "departureTimestamp": null, "delay": null, "platform": "",
       "prognosis": {"platform": null, "arrival": null, "departure": null,
         "capacity1st": null, "capacity2nd": null},
       "realtimeAvailability": null,
       "location": {"id": "8500301", "name": "Rheinfelden", "score": null,
         "coordinate": {"type": "WGS84", "x": 47.55121, "y": 7.792155}, "distance": null}},
     "@version": "1", "@timestamp": "2016-02-29T16:02:34.781Z"}
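
In the sample record, the `delay` value of 3 appears to be the gap in minutes between the scheduled `departure` (17:34) and the `prognosis` departure (17:37). A minimal Java sketch of that arithmetic follows. The class and method names are hypothetical and not taken from the demo code, and the sketch assumes every timestamp carries an offset in the `+0100` form shown on the slide.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not part of the demo repository.
class DelaySketch {

    // Matches timestamps such as 2016-02-29T17:34:00+0100 (offset without a colon).
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ssZ");

    // Whole minutes between the scheduled and the predicted departure.
    static long delayMinutes(String scheduled, String prognosis) {
        OffsetDateTime planned = OffsetDateTime.parse(scheduled, FORMAT);
        OffsetDateTime predicted = OffsetDateTime.parse(prognosis, FORMAT);
        return Duration.between(planned, predicted).toMinutes();
    }

    public static void main(String[] args) {
        // Both values are copied from the sample record.
        System.out.println(delayMinutes(
            "2016-02-29T17:34:00+0100", "2016-02-29T17:37:00+0100")); // prints 3
    }
}
```

Parsing into `OffsetDateTime` rather than comparing strings keeps the result correct if the two timestamps ever carry different offsets.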

  13. ARCHITECTURE
    [Architecture diagram. Labels recovered from the slide:
    Data Grid with Replication (shown twice), Delay Calculator Server Task (shown three times),
    Analytics Verticle, Injector Verticle (shown twice), Continuous Query Verticle, Http App Verticle,
    Sock JS Bridge, Analytics Laptop (Jupyter, HTTP), Real Time Laptop (JavaFX, Http, Websockets)]
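
The Continuous Query Verticle label suggests a standing query whose matches are pushed on as data arrives, instead of being polled for. The plain-Java sketch below illustrates only that general idea, using a map, a predicate and a callback. It does not use the Infinispan or Vert.x APIs, and the class name, the made-up train "S 1" and the 2-minute threshold are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.BiPredicate;

// Hypothetical stand-in for a cache with a continuous query attached.
class ContinuousQuerySketch<K, V> {

    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final BiPredicate<K, V> query;
    private final BiConsumer<K, V> onMatch;

    ContinuousQuerySketch(BiPredicate<K, V> query, BiConsumer<K, V> onMatch) {
        this.query = query;
        this.onMatch = onMatch;
    }

    // Stores the entry, then notifies the listener if the entry satisfies the query.
    void put(K key, V value) {
        store.put(key, value);
        if (query.test(key, value)) {
            onMatch.accept(key, value);
        }
    }

    public static void main(String[] args) {
        List<String> delayed = new ArrayList<>();
        // Key: train name, value: delay in minutes. The threshold of 2 is an assumption.
        ContinuousQuerySketch<String, Integer> trains = new ContinuousQuerySketch<>(
            (name, delay) -> delay > 2,
            (name, delay) -> delayed.add(name + " +" + delay + "min"));

        trains.put("IR 1978", 3); // values from the sample record: matches the query
        trains.put("S 1", 0);     // made-up train: does not match
        System.out.println(delayed); // prints [IR 1978 +3min]
    }
}
```

A real data grid would also report entries that stop matching and would evaluate the query on the node that owns the data; this sketch leaves both out.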

  14. DEMO

  15. VERSATILITY OF INFINISPAN
    Distributed Cache, Shared Memory, Event Broker, Analytics

  16. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM

  17. THANK YOU!
    github.com/infinispan-demos/streaming-data-kubernetes
    infinispan.org
    redhat.com/en/technologies/jboss-middleware/data-grid
    openshift.com | vertx.io
