Slide 1

Slide 1 text

ELASTIC DATA PIPELINES WITH DC/OS ON AZURE
Michael Hausenblas | 2016-05-17 | 37th BigData.be Meetup, Brussels
© 2016 Mesosphere, Inc. All Rights Reserved.

Slide 2

Slide 2 text

• sys admin
• devops
• developer
• architect
• data engineer
• data scientist

Slide 3

Slide 3 text

LET'S TALK ABOUT WORKLOADS* …
• batch
• streaming
• PaaS
• MapReduce
*) kudos to Timothy St. Clair, @timothysc

Slide 4

Slide 4 text

MESSAGE QUEUES & ROUTERS
• Apache Kafka
• ØMQ, RabbitMQ, Disque (Redis-based), etc.
• fluentd, Logstash, Flume
• Akka streams
• cloud-only: AWS SQS, Google Cloud Pub/Sub
• see also queues.io

Slide 5

Slide 5 text

STREAM PROCESSING PLATFORMS
• Apache Storm
• Apache Spark
• Apache Samza
• Apache Flink
• Concord
• cloud-only: AWS Kinesis, Google Cloud Dataflow
• see also my webinar on stream processing

Slide 6

Slide 6 text

TIME SERIES DATASTORES
• InfluxDB
• OpenTSDB
• KairosDB
• Prometheus
• see also iot-a.info

Slide 7

Slide 7 text

CHALLENGES
• Setup and operation of components
• Elasticity: static vs. dynamic partitioning
• Efficient usage of resources (utilization/TCO)
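The difference between static and dynamic partitioning can be sketched with a toy calculation. This is an illustration only, not taken from the talk: the workload names, the demand figures, and the 18-node cluster are made-up assumptions, and real schedulers also account for CPU, memory, and placement constraints.

```python
# Toy model: demand is measured in whole nodes (hypothetical numbers).

def static_partitioning(demands, partition_size):
    """Each workload owns a fixed partition of the cluster.

    Demand above the partition size goes unserved, and idle nodes in one
    partition cannot be lent to another workload.
    Returns (utilization, unmet_demand_in_nodes).
    """
    served = {name: min(need, partition_size) for name, need in demands.items()}
    capacity = partition_size * len(demands)
    unmet = sum(demands.values()) - sum(served.values())
    return sum(served.values()) / capacity, unmet


def dynamic_partitioning(demands, cluster_size):
    """All workloads draw from one shared pool of nodes.

    Returns (utilization, unmet_demand_in_nodes).
    """
    total = sum(demands.values())
    served = min(total, cluster_size)
    return served / cluster_size, total - served


# Hypothetical demand: streaming is busy, batch is mostly idle.
demands = {"batch": 2, "streaming": 9, "paas": 4}

static_util, static_unmet = static_partitioning(demands, partition_size=6)
dynamic_util, dynamic_unmet = dynamic_partitioning(demands, cluster_size=18)

print(f"static:  {static_util:.0%} utilized, {static_unmet} nodes of demand unmet")
print(f"dynamic: {dynamic_util:.0%} utilized, {dynamic_unmet} nodes of demand unmet")
```

With these assumed numbers the same 18 nodes serve 12 nodes of demand when split into three fixed partitions (about 67% utilized, streaming short by 3 nodes) and all 15 nodes of demand when shared (about 83% utilized).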

Slide 8

Slide 8 text

TIME FOR A NEW KIND OF OPERATING SYSTEM

Slide 9

Slide 9 text

SINGLE MACHINE APPLICATION
[Diagram: one app on one OS on one piece of hardware]

Slide 10

Slide 10 text

DISTRIBUTED APPLICATION
[Diagram: one app spanning seven machines, each with its own hardware and OS]

Slide 11

Slide 11 text

DISTRIBUTED OS + DISTRIBUTED APP
[Diagram: one app on a distributed OS that spans seven machines, each with its own hardware and OS]

Slide 12

Slide 12 text

CONTAINERS

Slide 13

Slide 13 text

CONTAINER ARTIFACTS LAYER DIAGRAM

Slide 14

Slide 14 text

LINUX CONTAINERS
The why and the what:
• Containers vs VMs
• app-level dependency management
• lightweight (startup time, footprint, average runtime)
• isolation & security

Slide 15

Slide 15 text

LINUX CONTAINERS
• namespaces
  • Isolate PIDs between processes
  • Isolate process to network resources
  • Isolate the hostname to fake it out (UTS)
  • Isolate the filesystem mount points (chroot)
  • Isolate inter process communication (IPC)
  • Isolate specific users to specific processes
• cgroups
https://sysadmincasts.com/episodes/14-introduction-to-linux-control-groups-cgroups
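To make the namespace list concrete, here is a minimal sketch that shows which namespaces the current process lives in. It assumes a Linux host with `/proc` mounted; the mapping table and the function name `current_namespaces` are illustrative and not part of the talk. On other systems the function simply returns an empty result.

```python
import os

# Entry names under /proc/<pid>/ns, mapped to the isolation listed on the slide.
NAMESPACE_KINDS = {
    "pid": "process IDs",
    "net": "network resources",
    "uts": "hostname",
    "mnt": "filesystem mount points",
    "ipc": "inter-process communication",
    "user": "user and group IDs",
}


def current_namespaces(pid="self"):
    """Return {namespace name: identifier} for the given process.

    Each entry under /proc/<pid>/ns is a symlink whose target looks like
    'pid:[4026531836]'. Two processes share a namespace exactly when the
    identifiers match. Returns an empty dict if /proc is not available.
    """
    ns_dir = f"/proc/{pid}/ns"
    found = {}
    if not os.path.isdir(ns_dir):
        return found
    for name in sorted(os.listdir(ns_dir)):
        if name not in NAMESPACE_KINDS:
            continue  # skip entries the slide does not cover, e.g. cgroup
        try:
            found[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable for this process
    return found


for name, ident in current_namespaces().items():
    print(f"{name:5} {ident:22} isolates {NAMESPACE_KINDS[name]}")
```

Running this once on the host and once inside a container typically shows different identifiers for the same namespace names, which is the isolation the slide describes.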

Slide 16

Slide 16 text

MEET THE DATACENTER OPERATING SYSTEM

Slide 17

Slide 17 text

LOCAL OS VS DISTRIBUTED OS

Slide 18

Slide 18 text

DC/OS ARCHITECTURE

Slide 19

Slide 19 text

https://dcos.io

Slide 20

Slide 20 text

https://dcos.io

Slide 21

Slide 21 text

DC/OS BENEFITS
• One cluster for
  • stateless services such as Web servers & app servers (via Marathon)
  • stateful services like PostgreSQL, MemSQL, Kafka, Cassandra, etc.
  • elastic data processing via Spark, Akka, etc.
  • CI/CD, for example Jenkins+Marathon
• Dynamic partitioning of your cluster, depending on your needs
• Increased utilization (10% → 80%+)
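As a rough illustration of running a stateless web server via Marathon, the sketch below builds an app definition as a plain dictionary and changes its instance count. The app id `/demo/webserver`, the `nginx` image, and the resource figures are hypothetical. The field names follow the Marathon app JSON format as I understand it from around the time of this talk, so check them against the documentation for your Marathon version. Nothing is sent to a cluster here.

```python
import copy
import json


def marathon_app(app_id, image, instances, cpus=0.1, mem=64):
    """Build a Marathon-style app definition for a stateless Docker service.

    All values are illustrative defaults, not recommendations.
    """
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,   # CPU shares per instance
        "mem": mem,     # memory per instance, in MB
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks the scheduler to pick a free host port
                "portMappings": [{"containerPort": 80, "hostPort": 0}],
            },
        },
    }


def scale(app, instances):
    """Return a copy of the app definition with a new instance count."""
    scaled = copy.deepcopy(app)
    scaled["instances"] = instances
    return scaled


web = marathon_app("/demo/webserver", "nginx", instances=2)
print(json.dumps(scale(web, 5), indent=2))
```

Because the service is stateless, scaling is only a change to `instances`; the scheduler decides where in the shared cluster the additional copies run.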

Slide 22

Slide 22 text


Slide 23

Slide 23 text


Slide 24

Slide 24 text

ELASTIC DATA PIPELINES

Slide 25

Slide 25 text

Hands-on …

Slide 26

Slide 26 text

A SLIGHTLY MORE COMPLEX EXAMPLE
mesosphere.com/blog/2015/11/18/dcos-time-series-demo/

Slide 27

Slide 27 text

LEARNING RESOURCES

Slide 28

Slide 28 text

WHERE CAN I LEARN MORE?
http://shop.oreilly.com/product/9781939902184.do
http://shop.oreilly.com/product/0636920035671.do

Slide 29

Slide 29 text

WHERE CAN I LEARN MORE?
https://www.nginx.com/resources/library/docker-networking/

Slide 30

Slide 30 text

WHERE CAN I LEARN MORE?
http://shop.oreilly.com/product/0636920039952.do
https://manning.com/books/mesos-in-action

Slide 31

Slide 31 text

Q & A
• @mhausenblas
• mhausenblas.info
• [email protected]
https://dcos.io