Slide 1

Slide 1 text

Streaming solutions for real time problems
Abhishek Gupta (@abhi_tweeter)
Senior Product Manager, Oracle
Oct 2, 2017
Copyright © 2017, Oracle and/or its affiliates. All rights reserved.

Slide 2

Slide 2 text

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Slide 3

Slide 3 text


Slide 4

Slide 4 text

Before we dive in…
• Goal
  – Using a practical example, familiarize you with a tech stack for dealing with fast/real-time/streaming data
• Agenda
  – 101s: Kafka, Kafka Streams & Redis
  – Sample app & implementation (using Oracle Cloud)
  – Q & A
• Content
  – SlideShare
  – GitHub

Slide 5

Slide 5 text

Real time

Slide 6

Slide 6 text

(Traditional) batch solution
[Diagram labels: Events, DWH, Aggregate, Batch processing, Static view of insights]

Slide 7

Slide 7 text

(Traditional) messaging-based solution
[Diagram labels: Events, Message Broker, Consumer, App, DB, Polling etc.]
1. Designed for in-memory
2. Consume and delete
Stream processing @ scale? DIY!

Slide 8

Slide 8 text

Stream processing to the rescue!
• Streams
  – Unbounded/infinite data set
  – Has volume and velocity: not just big data, but fast data
• Stream processing
  – Crunching/processing streams of data, ASAP!
  – Req-response, streaming, batch
  – Time, ordering, state etc.
http://www.capturearkansas.com/photos/550197

Slide 9

Slide 9 text

Use case: data center monitoring application
• Collect (simulate) metrics from multiple machines
• Crunch statistics (moving average)
• Monitor using a dashboard
data: {"machine":"machine-1","metrics":["8","20","36","65","2","20","73","67"]}
data: {"machine":"machine-2","metrics":["1","54","42","61","40","35","26","78"]}
. . . .
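The moving average mentioned on this slide could be computed along the following lines. This is a minimal sketch, not the talk's actual implementation: the class name `MovingAverage` and the window size of 3 are assumptions made for illustration. The sample payload carries metrics as strings, so they are parsed before use.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sliding-window moving average (hypothetical helper, not taken from the talk's code). */
final class MovingAverage {
    private final int size;                          // assumed window size, chosen by the caller
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum;

    MovingAverage(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        this.size = size;
    }

    /** Adds one reading and returns the average over the readings still in the window. */
    double add(double reading) {
        window.addLast(reading);
        sum += reading;
        if (window.size() > size) {
            sum -= window.removeFirst();             // evict the oldest reading
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        // Metrics arrive as strings in the sample payload, so parse them first.
        String[] metrics = {"8", "20", "36", "65", "2", "20", "73", "67"};
        MovingAverage average = new MovingAverage(3);
        for (String metric : metrics) {
            System.out.println(average.add(Double.parseDouble(metric)));
        }
    }
}
```

Keeping a running sum makes each update O(1) regardless of the window size, which matters when readings arrive continuously.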

Slide 10

Slide 10 text

Tech stack for a streaming solution
• Simulated Producer
• Kafka – Event Store
• Kafka Streams – Processor
• Redis – State Store
• Dashboard
[Other diagram labels: Partitions, Lists, Sorted Set, Service, App UI, SSE]

Slide 11

Slide 11 text

Apache Kafka: the Event Store
[Tech stack diagram, as on Slide 10]

Slide 12

Slide 12 text

Apache Kafka: 50,000 foot view
History
• Originally built @ LinkedIn
• Open sourced in early 2011
• Late 2012: ASF top-level project

Slide 13

Slide 13 text

Topics
[Diagram: topic "cpu-metrics" holding records machine1-59, machine3-23, machine5-42, machine6-43, machine2-17, ….]

Slide 14

Slide 14 text

Partitions
On disk
https://kafka.apache.org

Slide 15

Slide 15 text

Replication (and partitioning) in action
Humble beginning: single node

Slide 16

Slide 16 text

Replication (and partitioning) in action
Scale out…
[Diagram label: Zookeeper]
https://simplydistributed.wordpress.com/2016/12/13/kafka-partitioning/
https://svn.apache.org/repos/asf/zookeeper/logo/zookeeper.jpg

Slide 17

Slide 17 text

Producers
What goes where?
https://kafka.apache.org
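One way to picture "what goes where" for keyed records is a hash of the key taken modulo the partition count. The sketch below illustrates that idea only. Kafka's default partitioner hashes the serialized key with murmur2, and `Arrays.hashCode` is used here purely as a stand-in so the example needs no dependencies. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative key-to-partition mapping (hypothetical; NOT Kafka's actual partitioner). */
final class KeyPartitioner {
    private KeyPartitioner() {}

    /** Maps a record key to a partition index in the range [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Stand-in hash over the key bytes; Kafka itself uses murmur2 here.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // Clear the sign bit so the modulo result is never negative.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String machine : new String[] {"machine-1", "machine-2", "machine-3"}) {
            System.out.println(machine + " -> partition " + partitionFor(machine, 4));
        }
    }
}
```

Because the mapping depends only on the key and the partition count, all records for one machine land in the same partition and keep their relative order there.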

Slide 18

Slide 18 text

Consumers
[Diagram labels: Pub-sub, Queue, Kafka]
https://kafka.apache.org

Slide 19

Slide 19 text

Managed Kafka: Oracle Event Hub Cloud

Slide 20

Slide 20 text

Metrics Topic: Oracle Event Hub Cloud

Slide 21

Slide 21 text

Event producer: Oracle Application Container Cloud

Slide 22

Slide 22 text

So… what is Kafka?
• At its core: a distributed commit log
• Messaging system (Pub Sub + Queue)
• Reactive (& sharded) key-value store
• Database – read this and check out KSQL (a streaming SQL engine for Kafka)
• Data pipeline – thanks to Kafka Connect
• Streaming platform – stay awake to learn more on this!

Slide 23

Slide 23 text

Kafka Streams: processing engine
[Tech stack diagram, as on Slide 10]

Slide 24

Slide 24 text

Kafka Streams: what is it?
• Streams API: no need to deal with the Kafka Consumer and Producer APIs explicitly
• Use cases: big data, fast data, microservices, monoliths etc.
• Piggybacks on Kafka for scalability & fault tolerance
• One-record-at-a-time processing (no micro-batching)
• Separate infra isn’t mandatory (think about Spark, Storm etc.): deploy (and scale) anywhere, it’s just a Java app after all!
• Programming styles: high-level (fluent DSL) and low-level (Processor) APIs
• Stateful processing support + interactive queries
• Windowing, aggregations, joins etc.

Slide 25

Slide 25 text

Kafka Streams: APIs
• (High level) Fluent DSL API
• (Low level) Processor API

Slide 26

Slide 26 text

Kafka Streams: Topology
[Diagram panels: Conceptually, At runtime]
https://kafka.apache.org

Slide 27

Slide 27 text

Scaling a Kafka Streams app
[Diagram: topic "my-topic" with stream partitions p1, p2, p3, p4; Instance-1 (Thread-1) runs Task 1, Task 2, Task 3, Task 4; after scale out, Instance-2 (Thread-1) runs Task 3 and Task 4]

Slide 28

Slide 28 text

Scaling out is not the only option
• Techniques
  – Scale OUT: more instances
  – Scale UP: more threads
• Max parallelism
  – Bounded by the number of topic partitions, so the max number of useful instances is [no. of topic partitions / no. of threads per instance], e.g. 50 / 5 = 10
https://issues.apache.org/jira/browse/KAFKA-5683
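The arithmetic on this slide can be sketched as follows. The sketch assumes a single input topic, so that Kafka Streams creates one task per partition, and the class and method names are hypothetical.

```java
/** Back-of-the-envelope capacity maths for a Kafka Streams app (hypothetical helper). */
final class StreamsCapacity {
    private StreamsCapacity() {}

    /** Instances beyond this count would have no task to run (ceiling division). */
    static int maxUsefulInstances(int partitions, int threadsPerInstance) {
        if (partitions <= 0 || threadsPerInstance <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return (partitions + threadsPerInstance - 1) / threadsPerInstance;
    }

    /** Threads left without a task, assuming one task per input partition. */
    static int idleThreads(int partitions, int instances, int threadsPerInstance) {
        if (partitions <= 0 || instances <= 0 || threadsPerInstance <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return Math.max(0, instances * threadsPerInstance - partitions);
    }

    public static void main(String[] args) {
        // The slide's example: 50 partitions, 5 threads per instance.
        System.out.println(maxUsefulInstances(50, 5));
        // Over-provisioning: 12 instances with 5 threads each against 50 partitions.
        System.out.println(idleThreads(50, 12, 5));
    }
}
```

Adding instances or threads past the partition count only produces idle threads, which is why the partition count of the input topic is worth choosing with future scale in mind.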

Slide 29

Slide 29 text

Stateful stream processing with Kafka Streams
State stores
• Conceptually: lightweight embedded database within your stream processing layer to store ‘intermediate’ processing state (state is local to each app instance)
• Options: in-memory, persistent (RocksDB), custom store (e.g. external DB)
• State stores expose their internals using Interactive Queries
Interactive queries
• No additional data store: just ask your app!
• Needs some dev work to make your app (interactively) queryable

Slide 30

Slide 30 text

Stateful processing & (interactive) querying
[Diagram labels: Kafka; App Instance 1 and App Instance 2 with local state stores; machine1:8080, machine2:8080; custom RPC layer (e.g. REST API); external app]
• application.server config + StreamsMetadata API
• Query and get back the ‘complete’ state using custom API

Slide 31

Slide 31 text

Interactive queries in action
Blog: http://bit.ly/2fK1Io5

Slide 32

Slide 32 text

Fault tolerance – for stateless and stateful apps
[Diagram labels: Kafka; (app specific) data topic; (internal) compacted topic; App Instance 1 and App Instance 2 with local state stores; records k1-v1, k2-v2, k3-v3, k4-v4]
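The role of the internal compacted topic can be illustrated with a small replay loop: a local state store is rebuilt by reading the changelog from the start, with the latest value per key winning and a null value acting as a delete marker. This is a plain-Java sketch of that idea under stated assumptions, not Kafka Streams code. The `Change` type and `restore` method are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Rebuilds a key-value store from a changelog (hypothetical sketch, no Kafka dependency). */
final class ChangelogRestore {
    private ChangelogRestore() {}

    /** One changelog entry; a null value is treated as a delete marker (tombstone). */
    static final class Change {
        final String key;
        final String value;

        Change(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Replays entries in order so that the latest value per key wins. */
    static Map<String, String> restore(List<Change> changelog) {
        Map<String, String> store = new HashMap<>();
        for (Change change : changelog) {
            if (change.value == null) {
                store.remove(change.key);            // tombstone: drop the key
            } else {
                store.put(change.key, change.value); // newer value overwrites older one
            }
        }
        return store;
    }

    public static void main(String[] args) {
        List<Change> changelog = Arrays.asList(
                new Change("k1", "v1"),
                new Change("k2", "v2"),
                new Change("k1", "v1-updated"));
        System.out.println(restore(changelog));
    }
}
```

Since only the latest value per key is needed for the rebuild, compaction can discard older entries without changing the restored state.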

Slide 33

Slide 33 text

Kafka Streams processing app: Oracle Application Container Cloud
Let’s not forget about scale out!
[Diagram labels: Kafka, Metrics Processor, Metrics Processor]

Slide 34

Slide 34 text

Redis: the State Store
[Tech stack diagram, as on Slide 10]

Slide 35

Slide 35 text

Hello
• Stands for: RE(mote) DI(ctionary) S(erver)
• Versatile data structure server (written in C)
• Focus on in-memory with (tunable) persistence
• Not just any KV store
• Keys
  – From a simple string to binary
  – Max 512 MB (same for values)
  – Can be expired
• Values – any of the following
  – String, List, Hash
  – Set, Sorted Set
  – Geospatial, HyperLogLog
  – etc.

Slide 36

Slide 36 text

Redis data structures
• Sorted Sets
  – Each element has an associated score (basis for sort)
  – Basic ops: ZADD, ZINCRBY, ZREM, ZSCORE, ZCARD
  – View: ZRANGEBYSCORE, ZREVRANGEBYSCORE
  – Ranking: ZRANK, ZREVRANK
• Lists
  – To be specific: a Linked List
  – Operations at head (LPUSH) & tail (RPUSH) are O(1), search by index is O(N)
  – LRANGE, RPOP, LPOP to extract data & LTRIM to cap the size
  – Blocking ops: BLPOP, BRPOP
https://redis.io/commands
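The "push at the head, then trim to cap the size" pattern from the Lists bullets can be illustrated without a Redis server. The sketch below is an in-memory stand-in that mimics LPUSH followed by LTRIM, plus an LRANGE-style read. It is not a Redis client, and the class `CappedList` and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** In-memory stand-in for a size-capped Redis list (hypothetical, not a Redis client). */
final class CappedList {
    private final int maxSize;
    private final Deque<String> items = new ArrayDeque<>();

    CappedList(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    /** Mimics LPUSH followed by LTRIM 0 (maxSize - 1): add at the head, drop from the tail. */
    void push(String value) {
        items.addFirst(value);
        while (items.size() > maxSize) {
            items.removeLast();
        }
    }

    /** Mimics LRANGE 0 -1: every element, head (newest) first. */
    List<String> range() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        CappedList latest = new CappedList(3);
        for (String metric : new String[] {"8", "20", "36", "65"}) {
            latest.push(metric);
        }
        System.out.println(latest.range());
    }
}
```

Both the push and the trim work only at the ends of the list, which is where the slide notes list operations are O(1).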

Slide 37

Slide 37 text

etc…
• Good stuff
  – Redis Sentinel (HA), Master-Slave replication, Redis Cluster for partitioning, Pub Sub, Transactions, Lua scripting
• Use cases: messaging, cache, job queue, live leaderboard, counting stuff (efficiently), analytics, location-based (Geospatial) etc.
• Client libraries
  – Java, Scala, Go, Python, C++…
  – https://redis.io/clients

Slide 38

Slide 38 text

State store FAQs
• Redis vs Kafka Streams state store
  – Horses for courses!
• Can we combine both?
  – Depending on the use case, yes!
• Oh, and you can also use the Cache which comes with Oracle Application Container Cloud!
  – (Yet another) Blog: http://bit.ly/2yEN35q

Slide 39

Slide 39 text

Redis: Oracle Cloud Infrastructure

Slide 40

Slide 40 text

Monitoring Dashboard
[Tech stack diagram, as on Slide 10]

Slide 41

Slide 41 text

Dashboard app: Oracle Application Container Cloud
• JAX-RS & (Jersey) Server Sent Events
• CDI: Jedis (Redis) client @Produces
• EJB: TimerService and @Asynchronous
• Others: Jackson
Note: SSE and JSON-B are available in Java EE 8
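The `data: {...}` lines shown with the use case follow the Server-Sent Events wire format, in which each payload line is prefixed with `data: ` and a blank line ends the event. The sketch below builds such a frame by hand purely to show the format. It is not the Jersey SSE API, and the class and method names are hypothetical.

```java
/** Formats a payload as a Server-Sent Events frame (hypothetical helper, not the Jersey API). */
final class SseFrames {
    private SseFrames() {}

    /** Prefixes each payload line with "data: " and ends the event with a blank line. */
    static String dataEvent(String payload) {
        StringBuilder frame = new StringBuilder();
        // A limit of -1 keeps trailing empty lines instead of silently dropping them.
        for (String line : payload.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        return frame.append('\n').toString();
    }

    public static void main(String[] args) {
        String json = "{\"machine\":\"machine-1\",\"metrics\":[\"8\",\"20\",\"36\"]}";
        System.out.print(dataEvent(json));
    }
}
```

In the real dashboard a framework would normally produce these frames, so a helper like this is mainly useful for tests or for understanding what the browser's EventSource receives.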

Slide 42

Slide 42 text

(Oracle) Cloud based streaming solution
• Simulated Producer: Oracle Application Container Cloud
• Kafka – Event Store: Oracle Event Hub Cloud
• Kafka Streams – Processor: Oracle Application Container Cloud
• Redis – State Store: Oracle Compute Cloud
• Dashboard: Oracle Application Container Cloud
[Other diagram labels: Partitions, Lists, Sorted Set, Service, App UI, SSE]

Slide 43

Slide 43 text

Demo

Slide 44

Slide 44 text

Resources
• Oracle Application Container Cloud tutorials
• Oracle Stack Manager – Infrastructure-as-code
• Oracle PSM CLI – the CLI-of-everything (in Oracle PaaS!)
• Oracle Devs on Medium (blog) and Twitter
• Try Oracle Cloud!

Slide 45

Slide 45 text

Sessions which you should check out!

Slide 46

Slide 46 text

No content