Big Data In Action with Infinispan

Dealing with real-time, in-memory, streaming data is a unique challenge. With the advent of the smartphone and the IoT (trillions of internet-connected devices), we are witnessing exponential growth in data.

Building data layers that can satisfy these requirements is hard, but with the help of Infinispan, an in-memory data grid from Red Hat, you can take advantage of state-of-the-art distributed data processing capabilities to tackle the problem. From classic or full-text queries, through distributed Java Streams, to Spark/Hadoop integrations, this wide range of data processing capabilities makes Infinispan the perfect choice for the Big Data era.

In this session, we will identify critical patterns and principles that will help you achieve greater scale and response speed. On top of that, you will witness how Infinispan follows these patterns and principles to tackle a big data situation via a live coding demonstration.

Galder Zamarreño

April 27, 2017

Transcript

  1. Big Data in action with infinispan Galder Zamarreño Arrizabalaga 27th April 2017
  2. Moi • @ • Infinispan developer & co-founder • Lead client/server architecture • Functional programming @galderz #infinispan
  3. Build Infinispan based infrastructure to store, search and process near real-time data and calculate analytics
  4. real-time data Real-time data is challenging Delays can have big impact
  5. data growth Exponential data growth (smartphone, IoT...etc) How to analyse it?
  6. in-memory data grids IMDG
  7. What is an IMDG? • Distributed in-memory data • Server "mesh" • Peer-to-peer (P2P) • No master/slaves • No single bottleneck • No single point of failure • Commodity hardware
  8. Infinispan is an IMDG Custom Applications Mobile Applications Web Apps & Websites JBoss Middleware Fuse "memory" across machines into a unified data store Read-through, write-through, write-behind • NoSQL • Extreme Performance • Linear Scalability • Fault Tolerant • Event processing • Configurable ACID Txn Infinispan Databases and/or file system Analytical Framework
  9. Infinispan Use Cases Distributed Cache cache frequent data transient short-lived storage
  10. Infinispan Use Cases Distributed Cache cache frequent data transient short-lived storage NoSQL Database key/value store ACID transactions
  11. Infinispan Use Cases Distributed Cache cache frequent data transient short-lived storage NoSQL Database key/value store ACID transactions Event Broker listen to data changes continuous query
  12. Infinispan Use Cases Distributed Cache cache frequent data transient short-lived storage NoSQL Database key/value store ACID transactions Event Broker listen to data changes continuous query Data Analytics map/reduce via java stream spark/hadoop integration
  13. use examples • Web, Ecommerce • HTTP session • Shopping carts • Database/legacy offload: • Product catalog • Caching • Telecommunications • Cellular billings • Call routing, session info, • SMS content/notification • Travel • Aggregated flight pricing • Availability flights • Financial • Per-user portfolio data and risk analysis • Aggregated ticker stream • Defence • Sensor network data process and threat detection
  14. Infinispan Use Cases Distributed Cache cache frequent data transient short-lived storage NoSQL Database key/value store ACID transactions Event Broker listen to data changes continuous query Data Analytics map/reduce via java stream spark/hadoop integration
  15. Event Broker listen to data changes continuous query
  16. continuous query Continuous Query combines complex querying with reactive data changes
  17. Demo Domain Station board stop station train
  18. OpenShift Platform-as-a-Service (PaaS) Public or private Polyglot Based on Docker & Kubernetes
  19. Vert.x Tool-kit for building reactive applications on the JVM Event-Driven & Non-Blocking Polyglot
  20. Real-Time Demo Continuous Query Verticle Http App Verticle Data Grid Replication Sock JS Bridge Real Time Laptop Http Websockets JavaFX Injector Verticle
  21. Demo Real-time
  22. Data Analytics map/reduce via java stream spark/hadoop integration
  23. spark - hadoop Powerful analytics APIs Combo with Infinispan backend Separate process management
  24. distributed java streams Extended Java 8 Stream API to data stored in
  25. java 8 stream
    List<Integer> numbers = Arrays.asList(4, 74, 20, 97, 118, 50, 97, 34, 48);
    numbers.stream()
        .filter(i -> i > 70)
        // ^ Returns Stream<Integer>
        .map(n -> new String(Character.toChars(n)))
        // ^ Returns Stream<String>
        .reduce("", String::concat);
    // Returns "Java"
  26. Distributed streams map(λ) λ λ
  27. What is the time of the day when there is the biggest ratio of delayed trains?
  28. Analytics Demo Data Grid Replication Delay Calculator Server Task Delay Calculator Server Task Delay Calculator Server Task Analytics Verticle Injector Verticle Analytics Jupyter Laptop HTTP
  29. Demo Analytics
  30. Build Infinispan based infrastructure to store, search and process near real-time data and calculate analytics real-time data challenge
  31. Build Infinispan based infrastructure to store, search and process near real-time data and calculate analytics data growth problem real-time data challenge
  32. Build Infinispan based infrastructure to store, search and process near real-time data and calculate analytics real-time data challenge data growth problem continuous query for real-time
  33. Build Infinispan based infrastructure to store, search and process near real-time data and calculate analytics real-time data challenge data growth problem continuous query for real-time analysis with java streams
  34. credits Approve by Aha-Soft from the Noun Project engineer by Wilson Joseph from the Noun Project transformation by Felipe Perucho from the Noun Project analytics by Roman Kovbasyuk from the Noun Project Database sharing by YuguDesign from the Noun Project Server by designify.me from the Noun Project
  35. Thanks! • github.com/galderz/swiss-transport-datagrid • Branch: early17 • infinispan.org • openshift.com • vertx.io @galderz #infinispan
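
Slide 27 asks at what time of day the ratio of delayed trains is highest. The transcript does not include the demo's code, so the following is only a minimal sketch of that kind of calculation, written with the plain Java 8 Stream API over a local list. The `Stop` class, its fields, and the `DelayRatio` helper are hypothetical names invented for this illustration, not the demo's real model. In the talk the pipeline is described as running as distributed streams over data held in the grid; that part is not shown here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical record of one station-board stop (assumption, not the demo's model).
final class Stop {
    final int hourOfDay;     // scheduled departure hour, 0-23
    final int delayMinutes;  // 0 means on time

    Stop(int hourOfDay, int delayMinutes) {
        this.hourOfDay = hourOfDay;
        this.delayMinutes = delayMinutes;
    }

    boolean isDelayed() {
        return delayMinutes > 0;
    }
}

final class DelayRatio {

    // Ratio of delayed stops to all stops, grouped by hour of day.
    // Averaging 1.0 (delayed) and 0.0 (on time) yields that ratio directly.
    static Map<Integer, Double> ratioPerHour(List<Stop> stops) {
        return stops.stream()
            .collect(Collectors.groupingBy(
                s -> s.hourOfDay,
                Collectors.averagingDouble(s -> s.isDelayed() ? 1.0 : 0.0)));
    }

    // Hour with the highest delayed ratio, or -1 when there is no data.
    static int worstHour(List<Stop> stops) {
        return ratioPerHour(stops).entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse(-1);
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Stop> sample = Arrays.asList(
            new Stop(8, 5), new Stop(8, 0), new Stop(8, 3),
            new Stop(17, 0), new Stop(17, 2),
            new Stop(23, 0));
        System.out.println("Worst hour: " + worstHour(sample)); // prints "Worst hour: 8"
    }
}
```

Grouping by hour and averaging a 0/1 indicator keeps the whole calculation in a single `collect` step, which is the map/reduce shape that slides 22 to 26 describe.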