Big Data In Action with Infinispan

Big Data in Action with Infinispan
Galder Zamarreño Arrizabalaga
27th April 2017

About me
• @
• Infinispan developer & co-founder
• Lead client/server architecture
• Functional programming
@galderz #infinispan

Build an Infinispan-based infrastructure to store, search and process near
real-time data and calculate analytics

Real-time data
• Real-time data is challenging
• Delays can have a big impact

Data growth
• Exponential data growth (smartphones, IoT, etc.)
• How to analyse it?

In-memory data grids (IMDG)

What is an IMDG?
• Distributed in-memory data
• Server "mesh"
• Peer-to-peer (P2P)
• No master/slaves
• No single bottleneck
• No single point of failure
• Commodity hardware

Infinispan is an IMDG
Fuse "memory" across machines into a unified data store
• NoSQL
• Extreme performance
• Linear scalability
• Fault tolerant
• Event processing
• Configurable ACID transactions
Diagram labels: Custom Applications, Mobile Applications, Web Apps & Websites, JBoss Middleware; Infinispan; Databases and/or file system (read-through, write-through, write-behind); Analytical Framework
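The read-through and write-through modes named on this slide can be illustrated with a minimal plain-Java sketch. It is not Infinispan's cache store API: the class `ReadWriteThroughSketch`, its two maps and the `storeReads` counter are hypothetical names used only to show the idea of an in-memory layer in front of a slower store.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain-Java stand-in, not Infinispan's cache store API.
class ReadWriteThroughSketch {
    // Stands in for the database or file system behind the grid.
    final Map<String, String> store = new HashMap<>();
    // Stands in for the data held in memory.
    final Map<String, String> memory = new HashMap<>();
    // Counts how often the slower store had to be consulted.
    int storeReads = 0;

    // Read-through: on a miss, load from the store and keep the value in memory.
    String get(String key) {
        String cached = memory.get(key);
        if (cached != null) {
            return cached;
        }
        storeReads++;
        String loaded = store.get(key);
        if (loaded != null) {
            memory.put(key, loaded);
        }
        return loaded;
    }

    // Write-through: update the store synchronously, then the in-memory copy.
    void put(String key, String value) {
        store.put(key, value);
        memory.put(key, value);
    }

    public static void main(String[] args) {
        ReadWriteThroughSketch cache = new ReadWriteThroughSketch();
        cache.store.put("zurich", "Zurich HB");
        System.out.println(cache.get("zurich"));   // first read goes to the store
        System.out.println(cache.get("zurich"));   // second read is served from memory
        System.out.println("store reads: " + cache.storeReads);
        cache.put("bern", "Bern");
        System.out.println(cache.store.get("bern"));
    }
}
```

Write-behind would differ only in `put`: the store update would be queued and applied asynchronously instead of before the method returns.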

Infinispan use cases
• Distributed cache: cache frequent data, transient short-lived storage
• NoSQL database: key/value store, ACID transactions
• Event broker: listen to data changes, continuous query
• Data analytics: map/reduce via Java streams, Spark/Hadoop integration

Use examples
• Web, e-commerce: HTTP session, shopping carts
• Database/legacy offload: product catalog, caching
• Telecommunications: cellular billing; call routing, session info; SMS content/notification
• Travel: aggregated flight pricing, flight availability
• Financial: per-user portfolio data and risk analysis, aggregated ticker stream
• Defence: sensor network data processing and threat detection


Event broker: listen to data changes, continuous query

Continuous query
Continuous query combines complex querying with reactive data changes
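A minimal plain-Java sketch of the idea: a standing query is registered once, and a listener is told when an entry starts or stops matching it. This is a stand-in written for illustration, not Infinispan's continuous query API; `ContinuousQuerySketch`, its `Listener` interface and the sample train data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: a plain-Java stand-in, not Infinispan's continuous query API.
class ContinuousQuerySketch<K, V> {
    // Callbacks fired when an entry starts or stops matching the query.
    interface Listener<K, V> {
        void resultJoining(K key, V value);
        void resultLeaving(K key);
    }

    private final Map<K, V> data = new HashMap<>();    // the stored entries
    private final Set<K> matching = new HashSet<>();   // keys currently matching the query
    private final Predicate<V> query;
    private final Listener<K, V> listener;

    ContinuousQuerySketch(Predicate<V> query, Listener<K, V> listener) {
        this.query = query;
        this.listener = listener;
    }

    // Every write re-evaluates the query for the written key only.
    void put(K key, V value) {
        data.put(key, value);
        boolean matches = query.test(value);
        if (matches && matching.add(key)) {
            listener.resultJoining(key, value);
        } else if (!matches && matching.remove(key)) {
            listener.resultLeaving(key);
        }
    }

    void remove(K key) {
        data.remove(key);
        if (matching.remove(key)) {
            listener.resultLeaving(key);
        }
    }

    // Demo with invented data: the query is "trains with a delay above 0 minutes".
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        ContinuousQuerySketch<String, Integer> delays = new ContinuousQuerySketch<String, Integer>(
            minutes -> minutes > 0,
            new Listener<String, Integer>() {
                public void resultJoining(String train, Integer minutes) {
                    events.add("join " + train + " +" + minutes);
                }
                public void resultLeaving(String train) {
                    events.add("leave " + train);
                }
            });
        delays.put("IC 1", 0);   // on time: no event
        delays.put("IC 1", 5);   // becomes delayed: joins the result set
        delays.put("S 3", 2);    // delayed: joins the result set
        delays.put("IC 1", 0);   // back on time: leaves the result set
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the design is that clients never poll: the query is evaluated as data changes, and only the transitions are pushed out.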

Demo domain: station board, stop, station, train

OpenShift
• Platform-as-a-Service (PaaS)
• Public or private
• Polyglot
• Based on Docker & Kubernetes

Vert.x
• Tool-kit for building reactive applications on the JVM
• Event-driven & non-blocking
• Polyglot

Real-time demo
Diagram labels: Injector Verticle, Continuous Query Verticle, HTTP App Verticle, SockJS Bridge; Data Grid (replication); Laptop running the Real Time JavaFX client over HTTP and WebSockets

Demo: real-time

Data analytics: map/reduce via Java streams, Spark/Hadoop integration

Spark / Hadoop
• Powerful analytics APIs
• Combined with an Infinispan backend
• Separate process management

Distributed Java streams
Extends the Java 8 Stream API to data stored in Infinispan

Java 8 stream

    import java.util.Arrays;
    import java.util.List;

    List<Integer> numbers = Arrays.asList(4, 74, 20, 97, 118, 50, 97, 34, 48);

    numbers.stream()
        .filter(i -> i > 70)                          // returns Stream<Integer>
        .map(n -> new String(Character.toChars(n)))   // returns Stream<String>
        .reduce("", String::concat);                  // returns "Java"

Distributed streams
Diagram: map(λ), with λ shown on each node
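The picture of one map(λ) call fanning out to a λ per node can be approximated locally. In the sketch below each inner list stands in for the data owned by one node, the same function is applied to every partition, and only the results are gathered. `DistributedStreamSketch` and `mapOnEachNode` are hypothetical names, and a parallel stream on one JVM is used in place of a real cluster.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: each inner list stands in for the data owned by one node.
class DistributedStreamSketch {

    // Apply the same function to every node's partition and gather the results.
    static <T, R> List<R> mapOnEachNode(List<List<T>> nodes, Function<T, R> lambda) {
        return nodes.parallelStream()                      // one task per partition
            .map(partition -> partition.stream()
                .map(lambda)                               // the function runs next to the data
                .collect(Collectors.toList()))
            .flatMap(List::stream)                         // only results are brought together
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample data: delay minutes held by three "nodes".
        List<List<Integer>> nodes = Arrays.asList(
            Arrays.asList(1, 2),
            Arrays.asList(3),
            Arrays.asList(4, 5));
        List<Integer> seconds = mapOnEachNode(nodes, minutes -> minutes * 60);
        System.out.println(seconds);
    }
}
```

In a real grid the function would have to be shipped to the other nodes, which is why the work is sent to the data rather than the data to the caller.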

What is the time of the day when there is
the biggest ratio of delayed trains?
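One way to phrase this question with the Java 8 Stream API is to group stops by hour of day and average a 0/1 "delayed" flag per group. The sketch below runs on a local list; the `Stop` class, its fields and the sample data are invented for illustration and are not taken from the demo repository.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: the data model and sample values are invented.
class DelayRatioSketch {

    // One train stop: the scheduled hour of day and the observed delay.
    static final class Stop {
        final int hour;
        final int delayMinutes;

        Stop(int hour, int delayMinutes) {
            this.hour = hour;
            this.delayMinutes = delayMinutes;
        }
    }

    // Ratio of delayed stops per hour: the average of 1.0 (delayed) and 0.0 (on time).
    static Map<Integer, Double> delayedRatioPerHour(List<Stop> stops) {
        return stops.stream().collect(Collectors.groupingBy(
            s -> s.hour,
            TreeMap::new,
            Collectors.averagingDouble(s -> s.delayMinutes > 0 ? 1.0 : 0.0)));
    }

    // Hour of day with the highest ratio of delayed stops.
    static int worstHour(List<Stop> stops) {
        return delayedRatioPerHour(stops).entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElseThrow(() -> new IllegalArgumentException("no stops"));
    }

    static List<Stop> sampleStops() {
        return Arrays.asList(
            new Stop(7, 0), new Stop(7, 3), new Stop(7, 0), new Stop(7, 0),   // 1 of 4 delayed
            new Stop(8, 5), new Stop(8, 2), new Stop(8, 0),                   // 2 of 3 delayed
            new Stop(17, 0), new Stop(17, 4));                                // 1 of 2 delayed
    }

    public static void main(String[] args) {
        System.out.println(delayedRatioPerHour(sampleStops()));
        System.out.println("worst hour: " + worstHour(sampleStops()));
    }
}
```

Because the per-hour result is a small map, this shape of computation suits running the grouping close to the data and sending back only the aggregates.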

Analytics demo
Diagram labels: Injector Verticle, Analytics Verticle; Data Grid (replication) with three Delay Calculator Server Tasks; Laptop running Analytics in Jupyter over HTTP

Demo: analytics

Real-time data and calculate analytics
• Real-time data challenge
• Data growth problem
• Continuous query for real-time
• Analysis with Java streams

Credits
• Approve by Aha-Soft from the Noun Project
• engineer by Wilson Joseph from the Noun Project
• transformation by Felipe Perucho from the Noun Project
• analytics by Roman Kovbasyuk from the Noun Project
• Database sharing by YuguDesign from the Noun Project
• Server by designify.me from the Noun Project

Thanks!
• github.com/galderz/swiss-transport-datagrid
• Branch: early17
• infinispan.org
• openshift.com
• vertx.io
@galderz #infinispan
