Elastic Data Pipelines with DC/OS on Microsoft Azure

© 2016 Mesosphere, Inc. All Rights Reserved.

ELASTIC DATA PIPELINES WITH DC/OS ON AZURE
Michael Hausenblas | 2016-05-17 | 37th BigData.be Meetup, Brussels

sys admin • devops • developer • architect • data engineer • data scientist

LET'S TALK ABOUT WORKLOADS* …
• batch
• streaming
• PaaS
• MapReduce
*) kudos to Timothy St. Clair, @timothysc

MESSAGE QUEUES & ROUTERS
• Apache Kafka
• ØMQ, RabbitMQ, Disque (Redis-based), etc.
• fluentd, Logstash, Flume
• Akka streams
• cloud-only: AWS SQS, Google Cloud Pub/Sub
• see also queues.io

STREAM PROCESSING PLATFORMS
• Apache Storm
• Apache Spark
• Apache Samza
• Apache Flink
• Concord
• cloud-only: AWS Kinesis, Google Cloud Dataflow
• see also my webinar on stream processing

TIME SERIES DATASTORES
• InfluxDB
• OpenTSDB
• KairosDB
• Prometheus
• see also iot-a.info

CHALLENGES
• Setup and operation of components
• Elasticity: static vs. dynamic partitioning
• Efficient usage of resources (utilization/TCO)
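To make the static vs. dynamic partitioning point concrete, here is a minimal sketch. The framework names and the per-slot demand figures are invented for illustration and do not come from the talk. With static partitioning each framework gets its own slice sized for its own peak; with dynamic partitioning one shared pool only has to cover the peak of the combined demand.

```python
# Hypothetical demand (in nodes) per framework over four time slots.
# These numbers are made up for illustration only.
DEMAND = {
    "kafka": [2, 2, 3, 2],
    "spark": [1, 6, 2, 1],
    "web": [4, 2, 2, 5],
}


def static_nodes(demand):
    """Static partitioning: every framework owns a slice sized for its own peak."""
    return sum(max(series) for series in demand.values())


def dynamic_nodes(demand):
    """Dynamic partitioning: one shared pool sized for the peak of the summed demand."""
    return max(sum(slot) for slot in zip(*demand.values()))


def utilization(demand, nodes):
    """Fraction of node-slots that actually run work."""
    slots = len(next(iter(demand.values())))
    total_demand = sum(sum(series) for series in demand.values())
    return total_demand / (nodes * slots)


if __name__ == "__main__":
    for label, nodes in (("static", static_nodes(DEMAND)),
                         ("dynamic", dynamic_nodes(DEMAND))):
        print(f"{label}: {nodes} nodes, utilization {utilization(DEMAND, nodes):.0%}")
```

Because the frameworks in this toy data peak at different times, the shared pool needs fewer nodes for the same work, which is where the utilization gain comes from.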

TIME FOR A NEW KIND OF OPERATING SYSTEM

SINGLE MACHINE APPLICATION
[Diagram: hardware, OS, app]

DISTRIBUTED APPLICATION
[Diagram: one app on top of seven machines, each with its own hardware and OS]

DISTRIBUTED OS + DISTRIBUTED APP
[Diagram: one app on top of a distributed OS, which spans seven machines, each with its own hardware and OS]

CONTAINER ARTIFACTS LAYER DIAGRAM

LINUX CONTAINERS
The why and the what:
• Containers vs VMs
• app-level dependency management
• lightweight (startup time, footprint, average runtime)
• isolation & security

LINUX CONTAINERS
• namespaces
  • Isolate PIDs between processes
  • Isolate process to network resources
  • Isolate the hostname to fake it out (UTS)
  • Isolate the filesystem mount points (chroot)
  • Isolate inter process communication (IPC)
  • Isolate specific users to specific processes
• cgroups
  https://sysadmincasts.com/episodes/14-introduction-to-linux-control-groups-cgroups
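The namespaces listed above can be inspected on a Linux host through the /proc filesystem. The following is a small sketch, not part of the talk: the function names and the mapping table are my own, and on systems without /proc (or for a non-existent PID) the helpers simply return empty results.

```python
import os

# Namespace types from the slide, keyed by their entry name under /proc/<pid>/ns.
SLIDE_NAMESPACES = {
    "pid": "process IDs",
    "net": "network resources",
    "uts": "hostname",
    "mnt": "filesystem mount points",
    "ipc": "inter-process communication",
    "user": "user and group IDs",
}


def list_namespaces(pid="self"):
    """Return {namespace name: link target} for a process, or {} if unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # The link target looks like "pid:[4026531836]"; processes that
            # share a namespace show the same identifier.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


def read_cgroups(pid="self"):
    """Return the cgroup membership lines of a process, or [] if unavailable."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:8} {target:28} {SLIDE_NAMESPACES.get(name, '')}")
    for line in read_cgroups():
        print("cgroup:", line)
```

Running it once on the host and once inside a container should show different namespace identifiers for the isolated namespace types.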

MEET THE DATACENTER OPERATING SYSTEM

DC/OS BENEFITS
• One cluster for
  • stateless services such as Web servers & app servers (via Marathon)
  • stateful services like PostgreSQL, MemSQL, Kafka, Cassandra, etc.
  • elastic data processing via Spark, Akka, etc.
  • CI/CD, for example Jenkins+Marathon
• Dynamic partitioning of your cluster, depending on your needs
• Increased utilization (10% → 80%+)
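As an illustration of running a stateless service via Marathon, the sketch below builds a Marathon-style app definition and scales it by changing the instance count. The app id, Docker image, and resource values are hypothetical. The field names (id, instances, cpus, mem, container.docker.image) follow the Marathon app definition format as I understand it, so check them against the documentation for your Marathon version. The code only produces JSON; submitting it (typically via a POST to Marathon's /v2/apps endpoint) is left out.

```python
import json


def marathon_app(app_id, image, instances, cpus=0.1, mem=64):
    """Build a minimal Marathon-style app definition for a Docker container."""
    return {
        "id": app_id,
        "instances": instances,  # number of copies to keep running
        "cpus": cpus,            # CPU shares per instance
        "mem": mem,              # memory per instance, in MB
        "container": {
            "type": "DOCKER",
            "docker": {"image": image},
        },
    }


def scale(app, instances):
    """Return a copy of the definition with a new instance count."""
    return {**app, "instances": instances}


if __name__ == "__main__":
    # "/webserver" and "nginx" are placeholder values for illustration.
    app = marathon_app("/webserver", "nginx", instances=2)
    print(json.dumps(scale(app, 5), indent=2))
```

Scaling up or down is then a matter of submitting the same definition with a different instance count, while the cluster decides where the instances run.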

WHERE CAN I LEARN MORE?
• https://www.nginx.com/resources/library/docker-networking/
• http://shop.oreilly.com/product/0636920039952.do
• https://manning.com/books/mesos-in-action

CONTAINERS

LOCAL OS  VS

DC/OS ARCHITECTURE

https://dcos.io

ELASTIC DATA PIPELINES

Hands-on …

A SLIGHTLY

LEARNING RESOURCES

Q & A