Service Backplanes for the Modern Data Center

Slide 1

Slide 1 text

Slide 2

Slide 2 text

Slide 3

Slide 3 text

Slide 4

Slide 4 text

Slide 5

Slide 5 text

Slide 6

Slide 6 text

Slide 7

Slide 7 text

Slide 8

Slide 8 text

© 2016 Mesosphere, Inc. All Rights Reserved.

Resource Allocation Via Configuration

Config:
● N1, N4, N7 ➝ Hadoop
● N2, N5, N8 ➝ Postgres
● N3, N6, N9 ➝ NGINX

[Diagram: nodes n1 to n9 in a grid, each assigned to one application]

Manual, static configuration
Applications take resource allocation as an input
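As a rough illustration of the static scheme on this slide, the sketch below pins each node to one application and computes cluster-wide utilization. The names STATIC_CONFIG, nodes_for, and cluster_utilization are hypothetical, and the load figures in the usage note are made up for illustration only.

```python
from typing import Dict, List

# Hypothetical static assignment mirroring the slide: every node is pinned to one app.
STATIC_CONFIG: Dict[str, List[str]] = {
    "hadoop": ["n1", "n4", "n7"],
    "postgres": ["n2", "n5", "n8"],
    "nginx": ["n3", "n6", "n9"],
}


def nodes_for(app: str) -> List[str]:
    """The application receives its node list as an input; nothing is negotiated."""
    return STATIC_CONFIG[app]


def cluster_utilization(load_by_app: Dict[str, float]) -> float:
    """Average utilization across all nodes, given each app's load (0.0 to 1.0)
    on the nodes it owns. Idle nodes cannot be lent to a busier application."""
    total_nodes = sum(len(nodes) for nodes in STATIC_CONFIG.values())
    busy = sum(
        load_by_app.get(app, 0.0) * len(nodes)
        for app, nodes in STATIC_CONFIG.items()
    )
    return busy / total_nodes
```

For example, if Hadoop runs hot while NGINX sits idle, `cluster_utilization({"hadoop": 0.9, "postgres": 0.3, "nginx": 0.0})` stays well below Hadoop's own load, because the static mapping gives Hadoop no way to borrow the idle NGINX nodes.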

Slide 9

Slide 9 text

Analogy: Manual Memory Management

Config:
● [0x0,0x1) ➝ calc.exe
● [0x1,0x4) ➝ winmine.exe
● [0x4,0x8) ➝ notepad.exe

[Diagram: physical memory from 0x0 to 0x8]

Applications take physical memory address range as an input

Slide 10

Slide 10 text

Consequences

● Utilization: Low ❌
● Deployment Agility: Low ❌
● Elasticity: None ❌
● Test / Dev / Staging Envs: Difficult ❌
● Simplicity: High ✅

… but it basically worked, and it was simple.

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Slide 13

Slide 13 text

Slide 14

Slide 14 text

Service Backplanes

Architecture
● Allow unmodified application software to run at scale
● Interface between application instances and provisioning APIs

[Diagram labels: Cassandra Backplane, Postgres Backplane, nodes n1, n2, n3, n4]

Slide 15

Slide 15 text

Key Backplane Functionality

Resource Management
● Allocate resources to apps
  ○ Fairness, utilization, etc.
● Elasticity and auto-scaling
● Oversubscription, perf isolation
● Abstractions for complex resources (e.g., GPUs)

Lifecycle Management
● Replace failed instances
  ○ Migrate state/data as needed
● Allow machines, racks to be replaced (safely!)
● Allow apps to be upgraded (safely!)

[Diagram labels: Resource Management, Lifecycle Management]

Backplane: interface between application and “cluster context”

Slide 16

Slide 16 text

Example: Upgrades at Scale

Upgrading 3-10 Cassandra nodes: annoying but manageable.
Upgrading 25k Cassandra nodes: really hard problem.

Challenges:
● Roll-backs, non-destructive upgrades
● Deploy upgrade to subset of cluster
● Move traffic away to avoid downtime
● Data migration

Hard to solve “inside” the app
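Two of the challenges above, deploying to a subset of the cluster and rolling back, can be sketched as a batched upgrade loop. This is a minimal sketch, not how any particular backplane does it: rolling_upgrade and its upgrade, healthy, and rollback callbacks are hypothetical, and traffic draining and data migration are left out entirely.

```python
from typing import Callable, List, Sequence


def rolling_upgrade(
    nodes: Sequence[str],
    upgrade: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    batch_size: int = 2,
) -> bool:
    """Upgrade nodes one batch at a time.

    After each batch, every node in that batch is health-checked. On the first
    unhealthy batch, all nodes upgraded so far are rolled back in reverse order
    and the function returns False. Returns True if every batch stays healthy.
    """
    upgraded: List[str] = []
    for start in range(0, len(nodes), batch_size):
        batch = list(nodes[start:start + batch_size])
        for node in batch:
            upgrade(node)
            upgraded.append(node)
        if not all(healthy(node) for node in batch):
            for node in reversed(upgraded):
                rollback(node)
            return False
    return True
```

Even this toy version needs to know what "healthy" means for the application and how to undo an upgrade, which is the kind of application-specific knowledge the slide argues is hard to keep inside the app itself.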

Slide 17

Slide 17 text

Not (Just) “Scheduling” or “Container Orchestration”

● Scheduling is important
● But: much more to backplanes than bin-packing or max-min fairness
● Requires deep knowledge of
  ○ Application semantics
  ○ Ops procedures
● Goal: transform prepackaged “server software” into “service”

“... there are not very many things that have aged as well as the [Linux] scheduler. Which is just another proof that scheduling is easy.”
(Linus Torvalds, 2001)

Slide 18

Slide 18 text

Slide 19

Slide 19 text

Slide 20

Slide 20 text

Common Pattern

Goal
Provide a software service to the rest of the organization
E.g., object storage, streaming data analysis, batch analytics, ML, etc.

Solution
● Start with off-the-shelf (OSS) software package
● Write “scripts” to deploy, manage, and upgrade instances

Slide 21

Slide 21 text

Problem #1: Backplanes Are Hard

Building fault-tolerant control planes for cluster services is not easy!
● Often tens of thousands of LOC
● Hard to test and debug
● Maintenance burden

Backplane downtime is service downtime

Slide 22

Slide 22 text

Problem #2: Not Seen As A Product

● In many cases, the service is the “product”
● Backplane is just a “bunch of scripts”
  ○ Not a distinct component of the system architecture
● Sometimes built in an ad-hoc way
● Often no rigorous specification or API

Slide 23

Slide 23 text

Problem #3: Redundancy Between Services

● Many backplanes are similar
● Typically built by different teams that don’t collaborate
  ○ No opportunity for code reuse
  ○ No shared infrastructure
● Each backplane cannot examine global cluster state
● Hard to define global policies that apply to all backplanes

Slide 24

Slide 24 text

Problem #4: Redundancy Between Organizations

● Many organizations have custom-written backplanes for Cassandra, Kafka, HDFS, etc.
● Often tightly coupled to their production environment
  ○ Result: fragile, not portable to other environments

Slide 25

Slide 25 text

The Gap From “Done” to “Deployable”

Developer “ships” a release of their software package
● Then >10k LOC is needed to deploy it at scale!

This sucks
● The upstream developer is the domain expert
● Developer ships code their customer can’t (directly) use

Can we standardize the functionality needed for large-scale deployments?
● Allow backplane functionality to move “up” the stack
● Tested and developed as part of the upstream software

Slide 26

Slide 26 text

Opportunity: Shrink Runbooks

1. Deploy to prod and pray
2. Document best practices (“runbook”)
3. Write scripts to handle common scenarios
4. Encode best practices as a service backplane

Slide 27

Slide 27 text

Slide 28

Slide 28 text

Rethinking Service Backplanes

1. Embrace backplanes as a standard component in large-scale distributed systems
  ● Not just “a few scripts”
2. Build infrastructure to make writing backplanes easier
3. Define standard APIs for communicating between backplanes and cluster infrastructure
4. Enable upstream software developers to ship backplanes as part of their software packages

Slide 29

Slide 29 text

Example Architecture

[Diagram labels: Cluster Operator, Backplane Manager, Cassandra Backplane, Postgres Backplane]

● Abstract away details of cloud or on-prem env.
● Clear API / interface for service backplanes
● Single operator interface, define global policy

Slide 30

Slide 30 text

Background: Apache Mesos

● “Manage your data center as a single pool of resources.”
● UC Berkeley: 2008
● Battle-tested at Twitter: 2009-2016
● Other users: Apple, eBay, Netflix, Microsoft, PayPal, Airbnb, Criteo, Yelp, Uber, ...

[Diagram labels: Mesos Master, Scheduler X, Scheduler Y, Machine M, Mesos Agent, Task, Executor]

Messages shown in the diagram:
1. “I have 8 CPUs, 8 disks, 64GB RAM”
2. “Offer: would you like 8 CPUs, 8 disks, and 64GB of RAM?”
3. “Accept: Launch container X.”
4. “Launch container X.”
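The message exchange on this slide follows an offer cycle: a machine reports its resources, the master offers them to a scheduler, and the scheduler either accepts by naming something to launch or declines. The sketch below is a toy model of that cycle only. It is not the Mesos API; Offer, ToyScheduler, and ToyMaster are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    """Resources one machine reports as available (hypothetical shape)."""
    agent: str
    cpus: int
    disks: int
    mem_gb: int


class ToyScheduler:
    """Accepts an offer only if it covers this scheduler's needs."""

    def __init__(self, name: str, cpus_needed: int, mem_gb_needed: int) -> None:
        self.name = name
        self.cpus_needed = cpus_needed
        self.mem_gb_needed = mem_gb_needed

    def consider(self, offer: Offer) -> Optional[str]:
        # Returning a container name means "accept"; None means "decline".
        if offer.cpus >= self.cpus_needed and offer.mem_gb >= self.mem_gb_needed:
            return f"container-{self.name}"
        return None


class ToyMaster:
    """Forwards each reported offer to schedulers in order until one accepts."""

    def __init__(self, schedulers: List[ToyScheduler]) -> None:
        self.schedulers = schedulers
        self.launched: List[Tuple[str, str]] = []  # (agent, container) pairs

    def agent_reports(self, offer: Offer) -> Optional[str]:
        for scheduler in self.schedulers:
            container = scheduler.consider(offer)
            if container is not None:
                self.launched.append((offer.agent, container))
                return container
        return None
```

With `ToyMaster([ToyScheduler("X", 4, 32)])`, reporting `Offer("M", 8, 8, 64)` ends with container-X recorded against machine M, mirroring the four messages on the slide. The toy hands the whole offer to whichever scheduler accepts first and ignores partial acceptance, fairness, and failure handling.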

Slide 31

Slide 31 text

Slide 32

Slide 32 text

Slide 33

Slide 33 text

Open Question: APIs

● Backplane ↔ backplane manager
● Application ↔ backplane
● Dimensions:
  ○ Push or pull (offer vs. request)
  ○ Optimistic or pessimistic
  ○ Declarative or imperative
  ○ Narrow or wide
● How to represent cluster resources?
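To make the "push or pull" dimension concrete, here is one possible reading, sketched under stated assumptions: in the pull style the backplane asks the manager for a specific amount, while in the push style the manager offers what is free and the backplane decides how much to take. ToyManager and its request and offer methods are hypothetical, and resources are reduced to a single CPU count.

```python
from typing import Callable


class ToyManager:
    """Hypothetical backplane manager tracking a single resource: free CPUs."""

    def __init__(self, free_cpus: int) -> None:
        self.free_cpus = free_cpus

    def request(self, cpus: int) -> bool:
        """Pull style: the backplane asks for an exact amount, all or nothing."""
        if 0 <= cpus <= self.free_cpus:
            self.free_cpus -= cpus
            return True
        return False

    def offer(self, decide: Callable[[int], int]) -> int:
        """Push style: the manager shows what is free and the backplane's
        callback returns how much it wants. The answer is clamped to the
        range [0, free_cpus]; the amount actually granted is returned."""
        wanted = decide(self.free_cpus)
        granted = max(0, min(wanted, self.free_cpus))
        self.free_cpus -= granted
        return granted
```

In this toy, a pull request for more than is free simply fails, whereas a push offer lets the backplane adapt to whatever is available. Which side should carry that decision logic is the kind of trade-off the slide leaves open.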

Slide 34

Slide 34 text

Open Question: Co-Design of Applications and Backplanes

● Where does the functionality live?
  ○ Application, backplane, or backplane manager
● Does this change how we should build common service features?
  ○ Security? Logging? Metrics? Fault tolerance? Service discovery? Data migration?

Slide 35

Slide 35 text

Conclusion

1. Many people are building service backplanes, even if they don’t call them that
2. Driven by industry forces that are likely to persist
3. We should embrace the need for backplanes and figure out how to build them properly