Streaming Data Analysis with Kubernetes

Dealing with real-time, in-memory streaming data is a unique challenge, and with the advent of smartphones and the IoT (trillions of internet-connected devices), data volumes are growing exponentially.

To handle this growth, you want to process data in a cloud environment that scales easily. Kubernetes offers container orchestration and auto-scaling capabilities that are well suited to streaming data use cases. Combined with Infinispan, an in-memory data grid from Red Hat, it gives you state-of-the-art distributed data processing capabilities to tackle these challenges.

In this session, we will identify critical patterns and principles that help you achieve greater scale and faster response times. You will also see how Infinispan follows these patterns and principles to tackle a Big Data problem, through a live coding demonstration on a container platform orchestrated by Kubernetes.

Galder Zamarreño

April 24, 2018

Transcript

  1. STREAMING DATA ANALYSIS WITH KUBERNETES. Great Indian Developer Summit. Galder Zamarreño Arrizabalaga, @galderz, 26th April 2018
  2. @GALDERZ #INFINISPAN #GIDS18 2 Since 2006 ENGINEER @galderz Community Lead and Core Developer INFINISPAN CO-FOUNDER (2009) MUDKIP ROCKS!
  3. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM
  4. THE PROBLEM: Exponential data growth year on year (smartphones, IoT devices, trillions of internet-connected devices...). Real-time streaming data processing is challenging. Delays can have a big impact.
  5. ZZZ... NO!
  6. THE PLATFORM: Platform-as-a-Service (PaaS). Platform for developing and running applications. Public or private and multi-language. OpenShift is a Kubernetes distro with extras.
  7. IS
  8. THE GLUE: Vert.x is a toolkit for building reactive apps. On the JVM, event-driven and non-blocking. RxJava integrates with Vert.x. Great at event transformation and coordination. Works best with many sources of events (modern apps!).
  9. DATA GRID
  10. THE DATA: http://transport.opendata.ch
  11. SAMPLE DATA - OPENDATA

    {"stop":{"station":{"id":"8500301","name":"Rheinfelden","score":null,"coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null},"arrival":null,"arrivalTimestamp":null,"departure":"2016-02-29T17:34:00+0100","departureTimestamp":1456763640,"delay":3,"platform":"4","prognosis":{"platform":"4","arrival":null,"departure":"2016-02-29T17:37:00+0100","capacity1st":1,"capacity2nd":1},"realtimeAvailability":null,"location":{"id":"8500301","name":"Rheinfelden","score":null,"coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null}},"name":"IR 1978","category":"IR","categoryCode":2,"number":"1978","operator":"SBB","to":"Basel SBB","capacity1st":null,"capacity2nd":null,"subcategory":"IR","timeStamp":1456761753983,"nextStation":{"station":{"id":"8500301","name":"Rheinfelden","score":null,"coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null},"arrival":"2016-02-29T17:34:00+0100","arrivalTimestamp":1456763640,"departure":null,"departureTimestamp":null,"delay":null,"platform":"","prognosis":{"platform":null,"arrival":null,"departure":null,"capacity1st":null,"capacity2nd":null},"realtimeAvailability":null,"location":{"id":"8500301","name":"Rheinfelden","score":null,"coordinate":{"type":"WGS84","x":47.55121,"y":7.792155},"distance":null}},"@version":"1","@timestamp":"2016-02-29T16:02:34.781Z"}
  12. SAMPLE DATA - SBB

    {"x":"8290840","y":"47483629","name":"IR 1978","trainrefdate":"29.02.16","category":"IR","trainid":"84/25934/18/24/95","direction":"15","prodclass":"4","delay":"7","passproc":"","lstopname":"Basel SBB","poly":[{"x":"8290840","y":"47483629","passproc":"","msec":"0","direction":"15"},{"x":"8290193","y":"47483647","passproc":"","msec":"2000","direction":"15"},{"x":"8289528","y":"47483674","passproc":"","msec":"4000","direction":"15"},{"x":"8288863","y":"47483701","passproc":"","msec":"6000","direction":"15"},{"x":"8288198","y":"47483728","passproc":"","msec":"8000","direction":"15"},{"x":"8287532","y":"47483755","passproc":"","msec":"10000","direction":"15"},{"x":"8286885","y":"47483773","passproc":"",...],"timeStamp":1456761728202,"@version":"1","@timestamp":"2016-02-29T16:02:11.321Z"}
  13. HIGH LEVEL ARCHITECTURE
  14. COMPONENT ARCHITECTURE datagrid infinispan pod infinispan pod infinispan pod datagrid-hotrod service /eventbus/delayed-trains delayed trains /eventbus/delayed-positions app main http vert.x verticle pod station boards vert.x verticle train positions vert.x verticle delayed positions
  15. DEMO
  16. VERSATILITY OF INFINISPAN: Distributed Cache, Shared Memory, Event Broker, Analytics
  17. BUILD STREAMING DATA ANALYSIS APPLICATION ON TOP OF A KUBERNETES PLATFORM
  18. THANK YOU! github.com/infinispan-demos/streaming-data-kubernetes infinispan.org redhat.com/en/technologies/jboss-middleware/data-grid openshift.com | vertx.io