Slide 1

Slide 1 text

Big Data in Action with Infinispan
Galder Zamarreño Arrizabalaga
7th September 2017

Slide 2

Slide 2 text

Build Infinispan-based infrastructure to store, search and process near real-time data and calculate analytics

Slide 3

Slide 3 text

Real-time data
• Real-time data is challenging
• Delays can have a big impact

Slide 4

Slide 4 text

Data growth
• Exponential data growth (smartphones, IoT, etc.)
• How to analyse it?

Slide 5

Slide 5 text

In-memory data grids (IMDG)

Slide 6

Slide 6 text

What is an IMDG?
• Distributed in-memory data
• Server "mesh"
• Peer-to-peer (P2P)
• No master/slaves
• No single bottleneck
• No single point of failure
• Commodity hardware

Slide 7

Slide 7 text

Infinispan is an IMDG
Fuse "memory" across machines into a unified data store
Read-through, write-through, write-behind
• NoSQL
• Extreme Performance
• Linear Scalability
• Fault Tolerant
• Event processing
• Configurable ACID Txn
Diagram labels: Custom Applications, Mobile Applications, Web Apps & Websites, JBoss Middleware; Infinispan; Databases and/or file system; Analytical Framework

Slide 8

Slide 8 text

Infinispan Use Cases
• Event Broker: listen to data changes, continuous query
• Data Analytics: map/reduce via Java streams, Spark/Hadoop integration
• Distributed Cache: cache frequent data, transient short-lived storage
• NoSQL Database: key/value store, ACID transactions

Slide 9

Slide 9 text

Event Broker
• Listen to data changes
• Continuous query

Slide 10

Slide 10 text

Continuous query
Continuous Query combines complex querying with reactive data changes
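
The idea of a continuous query can be sketched in plain Java, assuming a toy single-JVM store rather than a data grid. A query (a predicate) is registered once, and every later write is checked against it so that the listener hears when an entry starts or stops matching. The `Store` and `Listener` types and the `joining`/`leaving` callback names are hypothetical and are not the Infinispan API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ContinuousQuerySketch {

    /** Hypothetical callback interface for this sketch; not the Infinispan API. */
    interface Listener<K, V> {
        void joining(K key, V value); // entry now matches the query
        void leaving(K key);          // entry no longer matches the query
    }

    /** Toy single-JVM store that re-evaluates one registered query on every write. */
    static class Store<K, V> {
        private final Map<K, V> data = new HashMap<>();
        private final Predicate<V> query;
        private final Listener<K, V> listener;

        Store(Predicate<V> query, Listener<K, V> listener) {
            this.query = query;
            this.listener = listener;
        }

        void put(K key, V value) {
            V previous = data.put(key, value);
            boolean matchedBefore = previous != null && query.test(previous);
            boolean matchesNow = query.test(value);
            if (!matchedBefore && matchesNow) {
                listener.joining(key, value);
            } else if (matchedBefore && !matchesNow) {
                listener.leaving(key);
            }
        }
    }

    /** Registers the query "delay > 0 minutes" and replays a few writes. */
    static List<String> run() {
        List<String> events = new ArrayList<>();
        Store<String, Integer> delays = new Store<String, Integer>(
            minutes -> minutes > 0,
            new Listener<String, Integer>() {
                public void joining(String key, Integer value) {
                    events.add("joining " + key + " " + value);
                }
                public void leaving(String key) {
                    events.add("leaving " + key);
                }
            });
        delays.put("train-1", 0); // on time: no event
        delays.put("train-2", 5); // delayed: joins the result set
        delays.put("train-1", 3); // becomes delayed: joins
        delays.put("train-2", 0); // back on time: leaves
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The listener is never polled; it only reacts to writes, which is the "reactive data changes" half of the slide's definition.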

Slide 11

Slide 11 text

Demo Domain
Diagram labels: Station board, stop, station, train

Slide 12

Slide 12 text

OpenShift
• Platform-as-a-Service (PaaS)
• Public or private
• Polyglot
• Based on Docker & Kubernetes

Slide 13

Slide 13 text

Vert.x
• Toolkit for building reactive applications on the JVM
• Event-Driven & Non-Blocking
• Polyglot

Slide 14

Slide 14 text

Real-Time Demo
Diagram labels: Injector Verticle, Continuous Query Verticle, Http App Verticle, Data Grid, Replication, SockJS Bridge, Real Time, Laptop, Http, Websockets, JavaFX

Slide 15

Slide 15 text

Demo: Real-time

Slide 16

Slide 16 text

Data Analytics
• Map/reduce via Java streams
• Spark/Hadoop integration

Slide 17

Slide 17 text

Spark / Hadoop
• Powerful analytics APIs
• Combo with Infinispan backend
• Separate process management

Slide 18

Slide 18 text

Distributed Java streams
Extended Java 8 Stream API to data stored in

Slide 19

Slide 19 text

Distributed streams
Diagram labels: map(λ), λ, λ
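
One way to read the diagram is that the same lambda runs next to each portion of the data and only the partial results are merged. The sketch below imitates that in a single JVM, with each "node" modelled as a plain `Map`. The class, the `mapReduce` helper and the sample data are hypothetical, and no Infinispan API is used.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class PartitionedMapSketch {

    /**
     * Applies the same function to every node's local data, then merges the
     * partial results. In a real grid the function would be shipped to each
     * server; here the "nodes" are just maps in one JVM.
     */
    static <R> R mapReduce(List<Map<String, Integer>> nodes,
                           Function<Map<String, Integer>, R> perNode,
                           BinaryOperator<R> merge,
                           R identity) {
        return nodes.parallelStream().map(perNode).reduce(identity, merge);
    }

    /** Counts delayed trains (delay > 0 minutes) across three made-up nodes. */
    static long sample() {
        Map<String, Integer> nodeA = new HashMap<>();
        nodeA.put("train-1", 0);
        nodeA.put("train-2", 5);
        Map<String, Integer> nodeB = new HashMap<>();
        nodeB.put("train-3", 3);
        Map<String, Integer> nodeC = new HashMap<>();

        return mapReduce(
            Arrays.asList(nodeA, nodeB, nodeC),
            node -> node.values().stream().filter(d -> d > 0).count(),
            Long::sum,
            0L);
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```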

Slide 20

Slide 20 text

What is the time of the day when there is the biggest ratio of delayed trains?
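
As a rough illustration of the question, the calculation can be written against the standard Java 8 Stream API on a local list. The `Stop` class, its fields and the sample data are assumptions made up for this sketch; the demo itself runs the calculation inside the data grid, which this sketch does not attempt to reproduce.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DelayRatio {

    /** Hypothetical record of one stop: scheduled hour (0-23) and delay in minutes. */
    static class Stop {
        final int hour;
        final int delayMinutes;

        Stop(int hour, int delayMinutes) {
            this.hour = hour;
            this.delayMinutes = delayMinutes;
        }
    }

    /** Returns the hour of day with the highest share of delayed stops. */
    static int worstHour(List<Stop> stops) {
        // Averaging a 0/1 flag per hour gives the ratio of delayed stops in that hour.
        Map<Integer, Double> ratioByHour = stops.stream()
            .collect(Collectors.groupingBy(
                (Stop s) -> s.hour,
                Collectors.averagingInt((Stop s) -> s.delayMinutes > 0 ? 1 : 0)));

        return ratioByHour.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow(() -> new IllegalArgumentException("no stops"));
    }

    /** Made-up sample: 1 of 4 delayed at 8h, 0 of 2 at 12h, 2 of 3 at 17h. */
    static List<Stop> sampleStops() {
        return Arrays.asList(
            new Stop(8, 0), new Stop(8, 5), new Stop(8, 0), new Stop(8, 0),
            new Stop(12, 0), new Stop(12, 0),
            new Stop(17, 3), new Stop(17, 0), new Stop(17, 10));
    }

    public static void main(String[] args) {
        System.out.println(worstHour(sampleStops()));
    }
}
```

Grouping by hour and averaging a 0/1 flag avoids keeping separate delayed and total counters. If two hours tie, which one is returned is unspecified in this sketch.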

Slide 21

Slide 21 text

Analytics Demo
Diagram labels: Injector Verticle, Analytics Verticle, Data Grid, Replication, Delay Calculator Server Task (shown three times), Analytics, Jupyter, Laptop, HTTP

Slide 22

Slide 22 text

Demo: Analytics

Slide 23

Slide 23 text

Build Infinispan-based infrastructure to store, search and process near real-time data and calculate analytics

Slide 24

Slide 24 text

Credits
• Approve by Aha-Soft from the Noun Project
• engineer by Wilson Joseph from the Noun Project
• transformation by Felipe Perucho from the Noun Project
• analytics by Roman Kovbasyuk from the Noun Project
• Database sharing by YuguDesign from the Noun Project
• Server by designify.me from the Noun Project

Slide 25

Slide 25 text

Thanks!
• github.com/infinispan-demos/swiss-transport-datagrid
• Branch: early17
• infinispan.org
• openshift.com
• vertx.io
@galderz #infinispan