Streaming Data Workshop @ Devoxx

Dealing with real-time, in-memory, streaming data is a unique challenge, and with the advent of the smartphone and IoT (trillions of internet-connected devices), we are witnessing exponential growth in data. Learning how to implement architectures that handle real-time streaming data, where data flows constantly, and combine it with analysis and instant search capabilities is key to developing robust and scalable services and applications.

In this lab session, we will look at how to implement an architecture like this, using reactive open source frameworks.
The streaming data architecture has the following tiers:
* Data collection tier
* Data transport tier
* Analysis tier
* In-Memory data store tier
* Data access tier
* Client tier

An architecture based around the Swiss rail transport system will be used throughout the lab.

Lab session technologies: Java (attendees must be comfortable with Java 8), Infinispan, Vert.x, OpenShift.

Galder Zamarreño

November 06, 2017

Transcript

  1. Agenda
    • Install VM
    • Overview of Streaming Data Architecture
    • Micro introduction to
      ◦ Eclipse Vert.x
      ◦ Infinispan
      ◦ OpenShift
    • Warmup
    • Workshop
    • Wrap-up
  2. Hosts - config for 100% offline
    /etc/hosts
    127.0.0.1 datagrid-visualizer-myproject.127.0.0.1.nip.io
    127.0.0.1 simple-web-app-myproject.127.0.0.1.nip.io
    127.0.0.1 delayed-listener-myproject.127.0.0.1.nip.io
    127.0.0.1 delayed-trains-myproject.127.0.0.1.nip.io
    127.0.0.1 workshop-main-myproject.127.0.0.1.nip.io
  3. Streaming Data System - Definition

    In many scenarios the computation part of the system is operating in a non-hard real-time fashion, however, the clients may not be consuming the data in real-time, due to network delays, application design, or perhaps a client application is not even running. Put another way, what we really have is a non-hard real-time service with clients that consume data when they need it. This is a streaming data system, a non-hard real-time system that makes its data available at the moment a client application needs it, it is not soft or near, it is streaming.

    Streaming Data - Andrew G. Psaltis - Manning Publications.
  4. Streaming Data Architecture - Layers
    • Collection Tier
    • Message Queueing Tier
    • Analysis Tier
    • In-Memory Data Store
    • Data Access Tier
    • Long Term Storage
    • Browser, Device, Machine ... (appears twice on the slide)
  5. Streaming Data Architecture - Layers
    • Collection Tier
    • Message Queueing Tier
    • Analysis Tier
    • In-Memory Data Store
    • Data Access Tier
    • Long Term Storage
    • Browser, Device, Machine ... (appears twice on the slide)
  6. Streaming Data Architecture - Layers
    • Collection Tier
    • Message Queueing Tier
    • Analysis Tier
    • In-Memory Data Store
    • Data Access Tier
    • Browser, Device, Machine ... (appears twice on the slide) *
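To make the layering concrete, here is a minimal plain-JDK sketch of how the tiers could hand data to one another. It is not the workshop code: the class name `TierSketch`, the `"trainId,delayMinutes"` event format, and the "keep only delayed trains" analysis rule are all assumptions made for illustration, and a queue and a map stand in for the real message queueing and in-memory data store tiers.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Plain-JDK stand-ins for the tiers; every name here is hypothetical.
class TierSketch {
    // Message queueing tier: buffers raw events between collection and analysis.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    // In-memory data store tier: current delay in minutes per train id.
    private final Map<String, Integer> store = new ConcurrentHashMap<>();

    // Collection tier: accept a raw event of the assumed form "trainId,delayMinutes".
    void collect(String rawEvent) {
        queue.add(rawEvent);
    }

    // Analysis tier: drain the queue and keep only trains that are currently delayed.
    void analyze() {
        String event;
        while ((event = queue.poll()) != null) {
            String[] parts = event.split(",");
            String trainId = parts[0].trim();
            int delay = Integer.parseInt(parts[1].trim());
            if (delay > 0) {
                store.put(trainId, delay);
            } else {
                store.remove(trainId);
            }
        }
    }

    // Data access tier: a client asks for data at the moment it needs it.
    Integer delayOf(String trainId) {
        return store.get(trainId);
    }

    public static void main(String[] args) {
        TierSketch tiers = new TierSketch();
        tiers.collect("IC-1,5");
        tiers.collect("S-3,0");
        tiers.analyze();
        System.out.println("IC-1 delay: " + tiers.delayOf("IC-1"));
        System.out.println("S-3 delay: " + tiers.delayOf("S-3"));
    }
}
```

The point of the split is that the collection side never waits on clients: events are queued, analysed, and stored, and a client reads from the store whenever it happens to connect.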
  7. Eclipse Vert.x
    • Open-source project to create reactive applications and microservices
    • Inspired by Node.js
      ◦ Reactor pattern
      ◦ Events
    • Unopinionated
    • Non-blocking
    • Reactive
    • Scalable
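The reactor pattern mentioned on this slide can be sketched with the JDK alone: one loop takes events off a queue and dispatches each to the handlers registered for its address, so handlers run one at a time and should not block. The class `MiniEventLoop` and its methods `on`, `publish`, and `run` are hypothetical names for this illustration; this is not the Vert.x API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

// A toy single-threaded event loop; hypothetical, not Vert.x.
class MiniEventLoop {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    private final Queue<String[]> pending = new ArrayDeque<>();

    // Register a handler for an address.
    void on(String address, Consumer<String> handler) {
        handlers.computeIfAbsent(address, k -> new ArrayList<>()).add(handler);
    }

    // Enqueue an event; nothing runs until the loop picks it up.
    void publish(String address, String message) {
        pending.add(new String[] {address, message});
    }

    // Dispatch events until the queue is empty; returns the number of handler calls.
    // Handlers may publish further events, which are processed in the same run.
    int run() {
        int dispatched = 0;
        String[] event;
        while ((event = pending.poll()) != null) {
            List<Consumer<String>> targets =
                handlers.getOrDefault(event[0], Collections.<Consumer<String>>emptyList());
            for (Consumer<String> handler : targets) {
                handler.accept(event[1]);
                dispatched++;
            }
        }
        return dispatched;
    }

    public static void main(String[] args) {
        MiniEventLoop loop = new MiniEventLoop();
        loop.on("train.position", msg -> System.out.println("position: " + msg));
        loop.publish("train.position", "46.95,7.44");
        loop.run();
    }
}
```

Because a single thread runs every handler, a slow handler delays all events queued behind it, which is why non-blocking handlers matter in this style.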
  8. Infinispan
    • Open-source project mainly maintained by Red Hat
    • Latest stable release: 9.1; version 9.2 in development
    • Cache / In-Memory Datagrid
      ◦ Local Cache
      ◦ Hibernate Cache
      ◦ Replicated Caches
      ◦ Distributed Caches
      ◦ Transactions
      ◦ Listeners
      ◦ Continuous Query
      ◦ Clustered Data Structures (Multimaps, Counters, Locks … and more coming)
      ◦ Client/Server Architecture - HotRod, Memcached, REST …
      ◦ … and more ...
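Of the features listed, continuous query is perhaps the least self-explanatory: a standing query whose listener is told when an entry starts or stops matching, rather than being re-run on demand. The sketch below illustrates that idea over a plain `HashMap`. It is not the Infinispan API; `ContinuousQuerySketch`, `onJoin`, and `onLeave` are hypothetical names, and the "delay greater than zero" query is an assumed example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Illustrates the continuous-query idea only; hypothetical, not Infinispan.
class ContinuousQuerySketch<K, V> {
    private final Map<K, V> data = new HashMap<>();
    private final Predicate<V> query;
    private final BiConsumer<K, V> onJoin;
    private final Consumer<K> onLeave;

    ContinuousQuerySketch(Predicate<V> query, BiConsumer<K, V> onJoin, Consumer<K> onLeave) {
        this.query = query;
        this.onJoin = onJoin;
        this.onLeave = onLeave;
    }

    // Store the value and notify only when the key enters or leaves the result set.
    void put(K key, V value) {
        V previous = data.put(key, value);
        boolean matchedBefore = previous != null && query.test(previous);
        boolean matchesNow = query.test(value);
        if (matchesNow && !matchedBefore) {
            onJoin.accept(key, value);
        } else if (!matchesNow && matchedBefore) {
            onLeave.accept(key);
        }
    }

    public static void main(String[] args) {
        ContinuousQuerySketch<String, Integer> delays = new ContinuousQuerySketch<String, Integer>(
            delay -> delay > 0,
            (train, delay) -> System.out.println(train + " is delayed by " + delay + " min"),
            train -> System.out.println(train + " is back on time"));
        delays.put("IC-1", 0);
        delays.put("IC-1", 7);
        delays.put("IC-1", 0);
    }
}
```

In this simplified version, an update that keeps an entry inside the result set produces no notification.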
  9. OpenShift
    • Container platform
    • Kubernetes plus extras
    • Build pipelines
    • Auto-Scaling
    • Service Catalog
    • Online and Hosted
  10. Model - 2 real streams
    Entities: Train, Train Position, Stop, Timed Position, Station, GeoLoc
    Attributes: Timestamp, Lat/Lng, trainId, delay, name, id
    Cardinality shown: 1..N
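One way the model on this slide might map to Java classes is sketched below. The transcript gives only the entity and attribute names, so which attribute belongs to which entity, the field types, and where the 1..N relationship sits are all assumptions here, as is the helper `delayedTrainNames`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Assumed grouping of the slide's entities and attributes; not the workshop's classes.
class GeoLoc {
    final double lat;
    final double lng;
    GeoLoc(double lat, double lng) { this.lat = lat; this.lng = lng; }
}

class TimedPosition {
    final long timestamp;
    final GeoLoc geoLoc;
    TimedPosition(long timestamp, GeoLoc geoLoc) { this.timestamp = timestamp; this.geoLoc = geoLoc; }
}

class TrainPosition {
    final String trainId;
    final List<TimedPosition> positions; // assumed home of the 1..N relationship
    TrainPosition(String trainId, List<TimedPosition> positions) {
        this.trainId = trainId;
        this.positions = positions;
    }
}

class Train {
    final String id;
    final String name;
    Train(String id, String name) { this.id = id; this.name = name; }
}

class Station {
    final String id;
    final String name;
    Station(String id, String name) { this.id = id; this.name = name; }
}

class Stop {
    final Train train;
    final Station station;
    final int delay; // assumed to be minutes
    Stop(Train train, Station station, int delay) {
        this.train = train;
        this.station = station;
        this.delay = delay;
    }
}

class ModelSketch {
    // Names of trains with at least one delayed stop, in first-seen order.
    static List<String> delayedTrainNames(List<Stop> stops) {
        return stops.stream()
            .filter(stop -> stop.delay > 0)
            .map(stop -> stop.train.name)
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Station bern = new Station("s1", "Bern");
        List<Stop> stops = Arrays.asList(
            new Stop(new Train("t1", "IC 1"), bern, 4),
            new Stop(new Train("t2", "S 3"), bern, 0));
        System.out.println(delayedTrainNames(stops));
    }
}
```

Under this reading, stops and train positions would be the two streams, linked through the train identifier.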