Streaming Data Workshop @ Devoxx

Handling real-time, in-memory streaming data is a unique challenge, and with the advent of smartphones and IoT (trillions of internet-connected devices), the volume of data is growing exponentially. Learning to implement architectures that handle constantly flowing real-time data, and that combine it with analysis and instant search capabilities, is key to developing robust and scalable services and applications.

In this lab session, we will look at how to implement an architecture like this, using reactive open source frameworks.
The streaming data architecture has the following tiers:
* Data collection tier
* Data transport tier
* Analysis tier
* In-Memory data store tier
* Data access tier
* Client tier
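
As a rough illustration of how these tiers hand data to one another, the following plain-Java sketch wires them together in a single process. It is not the workshop code and uses none of Infinispan, Vert.x, or OpenShift: the class, method, and train names are hypothetical, and a queue and a map merely stand in for the transport and in-memory data store tiers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class StreamingTiersSketch {

    /** Hypothetical event shape: a train and its current delay in minutes. */
    static final class TrainEvent {
        final String trainId;
        final int delayMinutes;

        TrainEvent(String trainId, int delayMinutes) {
            this.trainId = trainId;
            this.delayMinutes = delayMinutes;
        }
    }

    // Data transport tier (stand-in): buffers events between collection and analysis.
    private final Queue<TrainEvent> transport = new ConcurrentLinkedQueue<>();

    // In-memory data store tier (stand-in): current delay per delayed train.
    private final Map<String, Integer> store = new ConcurrentHashMap<>();

    /** Data collection tier: accept an event from a source and enqueue it. */
    void collect(String trainId, int delayMinutes) {
        transport.add(new TrainEvent(trainId, delayMinutes));
    }

    /** Analysis tier: drain the queue and keep only trains that are delayed. */
    void analyze() {
        TrainEvent event;
        while ((event = transport.poll()) != null) {
            if (event.delayMinutes > 0) {
                store.put(event.trainId, event.delayMinutes);
            } else {
                store.remove(event.trainId);
            }
        }
    }

    /** Data access tier: answer a client query whenever the client asks. */
    List<String> delayedTrains() {
        List<String> ids = new ArrayList<>(store.keySet());
        Collections.sort(ids);
        return ids;
    }

    /** Client tier: push a few events through and read the result. */
    public static void main(String[] args) {
        StreamingTiersSketch pipeline = new StreamingTiersSketch();
        pipeline.collect("train-1", 0);
        pipeline.collect("train-2", 6);
        pipeline.collect("train-3", 2);
        pipeline.collect("train-2", 0); // train-2 is back on time
        pipeline.analyze();
        System.out.println(pipeline.delayedTrains());
    }
}
```

The point of the separation is that the client reads from the store when it needs the data, independently of when events were collected or analyzed.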

An architecture based around the Swiss rail transport system will be used throughout the lab.

Lab session technologies: Java (attendees must be comfortable with Java 8), Infinispan, Vert.x, OpenShift.

Galder Zamarreño

November 06, 2017
Transcript

  1. Streaming Data Workshop

  2. Who are we? Thomas @tsegismont • Galder @galderz • Katia @karesti

  3. Agenda • Install VM • Overview of Streaming Data Architecture • Micro introduction to ◦ Eclipse Vert.x ◦ Infinispan ◦ OpenShift • Warmup • Workshop • Wrap-up

  4. VM installation - Mouse TOUCHPAD

  5. VM installation - Cores

  6. VM installation - Enable 3D

  7. VM installation - Guest Additions - Add Optical Drive

  8. Optional: copy/paste both ways

  9. Start the VM

  10. Virtual Box / Devices / Insert Guest Additions CD...

  11. Update workshop in VM
    cd streaming-data-workshop
    git fetch origin
    git pull

  12. Hosts - config for 100% offline (/etc/hosts)
    127.0.0.1 datagrid-visualizer-myproject.127.0.0.1.nip.io
    127.0.0.1 simple-web-app-myproject.127.0.0.1.nip.io
    127.0.0.1 delayed-listener-myproject.127.0.0.1.nip.io
    127.0.0.1 delayed-trains-myproject.127.0.0.1.nip.io
    127.0.0.1 workshop-main-myproject.127.0.0.1.nip.io

  13. Streaming Data System - Definition: "In many scenarios the computation part of the system is operating in a non-hard real-time fashion, however, the clients may not be consuming the data in real-time, due to network delays, application design, or perhaps a client application is not even running. Put another way, what we really have is a non-hard real-time service with clients that consume data when they need it. This is a streaming data system, a non-hard real-time system that makes its data available at the moment a client application needs it, it is not soft or near, it is streaming." (Streaming Data, Andrew G. Psaltis, Manning Publications)

  14. Streaming Data Architecture - Layers • Collection Tier • Message Queueing Tier • Analysis Tier • In-Memory Data Store • Data Access Tier • Long Term Storage • Browser, Device, Machine ... • Browser, Device, Machine ...

  15. Streaming Data Architecture - Layers • Collection Tier • Message Queueing Tier • Analysis Tier • In-Memory Data Store • Data Access Tier • Long Term Storage • Browser, Device, Machine ... • Browser, Device, Machine ...

  16. Streaming Data Architecture - Layers • Collection Tier • Message Queueing Tier • Analysis Tier • In-Memory Data Store • Data Access Tier • Browser, Device, Machine ... • Browser, Device, Machine ... *

  17. Eclipse Vert.x • Open-source project to create reactive applications and microservices • Inspired by Node.js ◦ Reactor pattern ◦ Events • Unopinionated • Non-blocking • Reactive • Scalable

  18. Infinispan • Open-source project mainly maintained by Red Hat • Latest stable release: 9.1; version 9.2 in development • Cache / In-Memory Data Grid ◦ Local Cache ◦ Hibernate Cache ◦ Replicated Caches ◦ Distributed Caches ◦ Transactions ◦ Listeners ◦ Continuous Query ◦ Clustered Data Structures (Multimaps, Counters, Locks … and more coming) ◦ Client/Server Architecture - HotRod, Memcached, REST … ◦ … and more ...

  19. OpenShift • Container platform • Kubernetes plus extras • Build pipelines • Auto-Scaling • Service Catalog • Online and Hosted

  20. Warmup

  21. Ex 1: Your first application • Vert.x web application • Infinispan cluster • Deploy in OpenShift

  22. Demo

  23. Model - 2 real streams (diagram labels: Train, Train Position, Stop, Timed Position, Station, GeoLoc, Timestamp, Lat/Lng, trainId, delay, name, id, 1..N)

  24. Vert.x • Microservices • Event bus

  25. RxJava - Vert.x Rxified API

  26. Infinispan • Remote caches via HotRod Protocol • Continuous query • Ickle query

  27. Hands On
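
Slide 26 lists continuous query among the Infinispan features used in the workshop. The idea behind it can be sketched in plain Java as follows: a query is registered once, and a listener is notified whenever an entry starts or stops matching it. This is only a conceptual sketch and not the Infinispan API; the class, the `Listener` interface, and the train identifier are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

public class ContinuousQuerySketch {

    /** Hypothetical callbacks fired as entries enter or leave the result set. */
    interface Listener {
        void joining(String key, int value);
        void leaving(String key);
    }

    private final Map<String, Integer> cache = new HashMap<>();
    private final Set<String> matching = new HashSet<>(); // keys currently in the result set
    private final Predicate<Integer> query;
    private final Listener listener;

    ContinuousQuerySketch(Predicate<Integer> query, Listener listener) {
        this.query = query;
        this.listener = listener;
    }

    /** Store a value and re-evaluate the query for that key only. */
    void put(String key, int value) {
        cache.put(key, value);
        boolean matches = query.test(value);
        if (matches && matching.add(key)) {
            listener.joining(key, value);   // newly matches the query
        } else if (!matches && matching.remove(key)) {
            listener.leaving(key);          // used to match, no longer does
        }
    }

    /** Feed delay updates for one train and record the notifications. */
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        ContinuousQuerySketch delayed = new ContinuousQuerySketch(
                delay -> delay > 0,
                new Listener() {
                    @Override
                    public void joining(String key, int value) {
                        events.add("joining " + key + " delay=" + value);
                    }

                    @Override
                    public void leaving(String key) {
                        events.add("leaving " + key);
                    }
                });
        delayed.put("train-7", 0); // on time: no notification
        delayed.put("train-7", 4); // becomes delayed: joining
        delayed.put("train-7", 7); // still delayed: no notification in this sketch
        delayed.put("train-7", 0); // back on time: leaving
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Compared with polling, the client does not have to re-run the query; it only reacts to changes in the result set.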