Slide 1

Slide 1 text

ELASTIC STREAM PROCESSING WITHOUT TEARS
Michael Hausenblas, Developer & Cloud Advocate | 2015-10-01 | Strata NYC
© 2015 Mesosphere, Inc. All Rights Reserved.

Slide 2

Slide 2 text

WHY IS STREAM PROCESSING A THING?

Slide 3

Slide 3 text

LET'S TALK ABOUT WORKLOADS* …
[Figure: workload types labeled batch, streaming, PaaS, MapReduce]
*) kudos to Timothy St. Clair, @timothysc

Slide 4

Slide 4 text

AIRLINES

Slide 5

Slide 5 text

LOGISTICS

Slide 6

Slide 6 text

HEALTH CARE

Slide 7

Slide 7 text

TRADERS

Slide 8

Slide 8 text

FARMERS

Slide 9

Slide 9 text

CITIES
© 2014, Wired magazine

Slide 10

Slide 10 text

YOU

Slide 11

Slide 11 text

MEET THE DATACENTER OPERATING SYSTEM (DCOS)

Slide 12

Slide 12 text

LOCAL OS VS. DISTRIBUTED OS
http://bitly.com/os-vs-dcos

Slide 13

Slide 13 text

DCOS IS A DISTRIBUTED OPERATING SYSTEM
• local OS per node (+container enabled)
• scheduling (long-lived, batch)
• networking
• service discovery
• stateful services
• security
• monitoring, logging, debugging

Slide 14

Slide 14 text


Slide 15

Slide 15 text


Slide 16

Slide 16 text

DCOS BENEFITS
• Run stateless services (web servers, app servers, etc.) and Big Data services (HDFS, C*, Spark, etc.) together on one cluster
• Dynamic partitioning of your cluster, depending on your needs (business requirements)
• Increased utilization (10% → 80% and more)

Slide 17

Slide 17 text

THE TOOLBOX

Slide 18

Slide 18 text

MESSAGE QUEUES & ROUTERS
• Kafka
• ØMQ, RabbitMQ, Disque (Redis-based), etc.
• fluentd, Logstash, Flume, etc.
• Akka streams
• cloud-only: AWS SQS, Google Cloud Pub/Sub
• see also queues.io

Slide 19

Slide 19 text

STREAM PROCESSING PLATFORMS
• Storm
• Spark
• Samza
• Flink
• Concord
• cloud-only: AWS Kinesis, Google Cloud Dataflow
• see also my webinar on stream processing

Slide 20

Slide 20 text

TIME SERIES DATASTORES
• InfluxDB
• OpenTSDB
• KairosDB
• Prometheus
• see also iot-a.info

Slide 21

Slide 21 text

EXAMPLE

Slide 22

Slide 22 text

Profiling & Benchmarking Streaming Systems
Shinji Kim, Co-founder & CEO @concord
http://concord.io

Slide 23

Slide 23 text

Concord
• Distributed, event-based stream processing framework
• Built on top of Apache Mesos, in C++
• Simple to use, all-in-one stream processing

Slide 24

Slide 24 text

Concord

Slide 25

Slide 25 text

Benchmarking a Stream Processor
• Distributed systems mean distributed results
• You can’t profile processes as you would on a single machine
• Latency measurements require instrumentation

Slide 26

Slide 26 text

Prior approaches
Benchmarking Apache Samza: 1.2 million messages per second on a single node
https://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node

Slide 27

Slide 27 text

Prior approaches
• Benchmarking scenarios
  – Message passing
  – Key counting in memory
• Isolating framework performance vs. user code performance
• Sampling throughput in 1-second windows
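
Sampling throughput in 1-second windows can be sketched as follows. This is a minimal illustration rather than the tooling used in the prior work or in this benchmark: the function name `throughput_per_window` and the sample arrival times are assumptions made up for the example.

```python
from collections import Counter


def throughput_per_window(timestamps, window_seconds=1.0):
    """Count messages per fixed-size window.

    `timestamps` are message arrival times in seconds.
    Returns a dict mapping window index -> messages seen in that window.
    """
    counts = Counter()
    for ts in timestamps:
        # Integer division assigns each arrival to its window.
        counts[int(ts // window_seconds)] += 1
    return dict(counts)


# Hypothetical arrival times: 3 messages in second 0, 2 in second 1, 1 in second 3.
arrivals = [0.10, 0.50, 0.90, 1.20, 1.80, 3.05]
samples = throughput_per_window(arrivals)
print(samples)  # {0: 3, 1: 2, 3: 1}
```

Windows without any message (second 2 here) simply do not appear, so a report built on this would need to fill in zeros explicitly.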

Slide 28

Slide 28 text

Our Approach
• Be realistic: Kafka as a data source
  – Convenient way to regulate data flow
• Frequency counting as a simple task
  – Demonstrates the correctness of the framework, without accidentally benchmarking C++ vs. Java, etc.
  – Dictionary limited to 9000 words to avoid excess memory allocation / pressure
• Sample msg throughput for data source & sink
• End-to-end latency (Concord only)
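
Why the bounded dictionary matters: however many messages flow through, the counting state never holds more than 9000 keys. A rough sketch of such a workload, where the synthetic `word<i>` vocabulary, the seed, and the message count are assumptions (the slides do not say which words were used):

```python
import random
from collections import Counter

DICTIONARY_SIZE = 9000  # limit stated on the slide


def make_dictionary(size=DICTIONARY_SIZE):
    # Hypothetical synthetic vocabulary; stands in for the real word list.
    return [f"word{i}" for i in range(size)]


def random_messages(dictionary, n, seed=0):
    """Yield n words drawn uniformly at random from the dictionary."""
    rng = random.Random(seed)
    for _ in range(n):
        yield rng.choice(dictionary)


dictionary = make_dictionary()
counts = Counter(random_messages(dictionary, 50_000))

# The state is bounded by the dictionary, not by the number of messages.
print(len(counts) <= DICTIONARY_SIZE, sum(counts.values()))
```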

Slide 29

Slide 29 text

Setup
Each cluster has 6 nodes:
• n1-standard-4: 4 vCPUs, 15 GB RAM, 160 GB SSD
• One “master”, 5 “workers”
• Kafka prefilled with 1.13 billion messages (random words)
• One worker dedicated to consuming from Kafka
• Remaining workers process msgs & log results

Slide 30

Slide 30 text

Setup

Slide 31

Slide 31 text

Test problem
Key counting (single node & 5 node cluster)
• 3 operator topology, a => b => c
• a reads from a queue
• b counts words
  – with every tuple, updated count emitted downstream
• c writes the result into a log file as CSV plaintext
  – word, frequency
• Log files to be post-processed to determine accuracy
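
The a => b => c topology can be mimicked in a single process with plain Python generators. This is only a sketch of the data flow, not Concord or Storm code: the in-memory `deque` stands in for Kafka, the `StringIO` buffer stands in for the log file, and all function names are made up for the example.

```python
import csv
import io
from collections import Counter, deque


def source(queue):
    """Operator a: drain words from a queue."""
    while queue:
        yield queue.popleft()


def count_words(words):
    """Operator b: emit the updated count for every incoming word."""
    counts = Counter()
    for word in words:
        counts[word] += 1
        yield word, counts[word]


def sink(records, out):
    """Operator c: write each (word, frequency) record as a CSV row."""
    writer = csv.writer(out, lineterminator="\n")
    for word, frequency in records:
        writer.writerow([word, frequency])


words_in = deque(["storm", "kafka", "storm", "mesos", "storm"])  # stand-in for Kafka
log = io.StringIO()                                              # stand-in for the log file
sink(count_words(source(words_in)), log)
print(log.getvalue())
```

Because b emits on every tuple, the log holds one row per input message, and the last row for a word carries its final count.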

Slide 32

Slide 32 text

Test problem
Word counting (single node & 5 node cluster)
• 3 operator topology
• Log files to be post-processed to determine accuracy
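
One way the post-processing step might look, assuming each log row is `word,frequency` and that the count for a word only ever grows, so the highest logged value is its final count. The helper names and the sample data are hypothetical; the slides do not describe the actual script.

```python
import csv
import io
from collections import Counter


def final_counts(log_lines):
    """Keep the highest frequency logged for each word."""
    finals = {}
    for word, frequency in csv.reader(log_lines):
        finals[word] = max(finals.get(word, 0), int(frequency))
    return finals


def mismatches(finals, expected):
    """Return {word: (logged, expected)} for every word whose counts differ."""
    words = set(finals) | set(expected)
    return {
        w: (finals.get(w, 0), expected.get(w, 0))
        for w in words
        if finals.get(w, 0) != expected.get(w, 0)
    }


# Hypothetical log contents and the input that should have produced them.
log = io.StringIO("storm,1\nkafka,1\nstorm,2\nmesos,1\nstorm,3\n")
expected = Counter(["storm", "kafka", "storm", "mesos", "storm"])

finals = final_counts(log)
print(mismatches(finals, expected))  # {} when the log matches the input
```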

Slide 33

Slide 33 text

Results (in-progress)
Storm
• Single node throughput: 16,000 msgs / sec
• Cluster-wide throughput: 65,000 msgs / sec
Concord
• Single node throughput: 100,000 msgs / sec
• Cluster-wide throughput: pending
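
For orientation, the ratios implied by these interim numbers work out as below. The arithmetic uses only the figures on the slide; the variable names are made up, and since the results are marked in-progress the ratios should be read as provisional.

```python
# Throughput figures from the slide, in msgs / sec.
storm_single = 16_000
storm_cluster = 65_000
concord_single = 100_000

storm_scaling = storm_cluster / storm_single      # cluster vs. single node
concord_vs_storm = concord_single / storm_single  # single node vs. single node

print(f"Storm cluster / single node: {storm_scaling:.2f}x")       # 4.06x
print(f"Concord / Storm, single node: {concord_vs_storm:.2f}x")   # 6.25x
```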

Slide 34

Slide 34 text

Lessons Learned
• It’s hard to set up each of these systems
• Measuring latency is tricky
  – Requires instrumentation
  – Requires the ability to follow a message all the way through the processing pipeline
• Necessary to isolate Kafka consumer performance
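
The instrumentation point can be illustrated by stamping each message at the source and reading the stamp at the sink. This is a generic sketch, not Concord's tracing: the `stamp` and `latency_at_sink` helpers and the message layout are assumptions. It also only holds when both ends read the same clock; across machines, the clocks would have to be synchronized.

```python
import time


def stamp(payload, clock=time.monotonic):
    """Source side: wrap the payload with its creation time."""
    return {"payload": payload, "created_at": clock()}


def latency_at_sink(message, clock=time.monotonic):
    """Sink side: seconds elapsed since the message was stamped."""
    return clock() - message["created_at"]


# Deterministic demo with a fake clock: stamped at t=10.0, received at t=10.25.
ticks = iter([10.0, 10.25])


def fake_clock():
    return next(ticks)


msg = stamp("hello", clock=fake_clock)
latency = latency_at_sink(msg, clock=fake_clock)
print(latency)  # 0.25
```

The stamp has to survive every operator in between, which is the "follow a message all the way through" requirement from the slide.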

Slide 35

Slide 35 text

Future Plans
• Finish benchmarking for Spark Streaming & Concord
• Scale up the Kafka cluster
• Isolate the performance of Kafka consumers
• Optimization efforts
• Other frameworks like Samza, Flink, etc.
• Instrument non-Concord frameworks with tracing to measure end-to-end latency

Slide 36

Slide 36 text

Questions & Feedback?
We know this is far from perfect, but we had to start somewhere… :)
Sign up for our office hours at: http://bit.ly/concordoh
[email protected]

Slide 37

Slide 37 text

MESOSPHERE IS HIRING, WORLDWIDE …
San Francisco, New York, Hamburg
https://mesosphere.com/careers/

Slide 38

Slide 38 text

Q & A
• @mhausenblas
• mhausenblas.info
• @mesosphere
• mesosphere.io/product
• mesosphere.com/infinity