Toward a Programmable Cloud: Foundations and Challenges

Joe Hellerstein, UC Berkeley
POPL Keynote, 2021

Sea Changes in Computing
• PDP-11, 1970: Minicomputers
• Cray-1, 1976: Supercomputers
• Macintosh, 1984: Personal Computers
• iPhone, 2007: Smart Phones
• AWS, 2006: Public Cloud

New Platform + New Language = Innovation
• AWS, 2006: Public Cloud + ?

The Big Query
How will people program the cloud? (AWS, 2006: Public Cloud + ?)
A grand challenge across core computer science.

A Database/Logic Approach
• Disorderly! Yet well-defined semantics. Parallelism and scale since the 1980s!
• Declarative: The cloud is crying out for this! Heterogeneous, evolving, autoscaling.
• Beautiful: Growing body of work on distributed logic programming. Asking and answering fundamental new questions! (PODS, POPL, PODC)

Prior Generation: Language Design, Top-down
SIGCOMM ‘05, SOSP ‘05, SIGMOD ‘06, SenSys ‘07, EuroSys ‘10, PODS ‘10, Datalog 2.0 ‘10, SIGOPS 2010, CIDR ‘11, SOCC ‘12

Latest Gen: Serverless Systems, Bottom-up
• Cloudburst: Stateful FaaS (VLDB20, SIGMOD20, Eurosys20)
• Anna: Autoscaling multi-tier KVS (ICDE18, VLDB19)
Proving out the foundations in a traditional systems setting.

Time to Bring it all Together
CACM Sep. 2020; CIDR 2021

Today: Lessons and Foundations
Declarative Programming
• Lessons learned from early languages and prototypes
• The need for theory
Logic Foundations
Algebraic Approaches
Open Problems and PACT Programming

Routing As Querying
link(X,Y).
path(X,Y) :- link(X,Y).
path(X,Z) :- link(X,Y), path(Y,Z).
path(X, s)?
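
The path rules above describe a transitive closure over link facts. As a rough illustration of how such rules can be evaluated bottom-up, here is a minimal Python sketch using semi-naive iteration. The function name `paths` and the sample links are assumptions made for this example; they do not come from the talk or from NDlog.

```python
def paths(links):
    """Compute path(X, Z) facts from link(X, Y) facts by semi-naive evaluation."""
    path = set(links)    # path(X,Y) :- link(X,Y).
    delta = set(links)   # path facts that were new in the latest round
    while delta:
        # path(X,Z) :- link(X,Y), path(Y,Z), joining only against the new facts.
        derived = {(x, z) for (x, y) in links for (y2, z) in delta if y == y2}
        delta = derived - path
        path |= delta
    return path

# Hypothetical network: a -> b -> c -> s
links = {("a", "b"), ("b", "c"), ("c", "s")}

# The query path(X, s)? asks which nodes can reach s.
reaches_s = sorted(x for (x, z) in paths(links) if z == "s")
```

Because every round only adds facts, the result is the same regardless of the order in which link facts arrive.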

Protocol Synthesis as Query Optimization
“Finally, writing the queries in NDlog illustrates surprising relationships between protocols. For example, we have shown that distance vector and dynamic source routing protocols differ only in a simple, traditional query optimization decision: the order in which a query’s predicates are evaluated.” (CACM 11/2009)

DHT Overlays in Logic (SOSP 2005)

A Hadoop Backend in Logic: BOOM Analytics (EuroSys ‘10)
• BOOM-FS: Rewrite of Hadoop’s HDFS in Overlog
• Replacement of Hadoop Scheduler in Overlog
“We began by creating an Overlog implementation of basic Paxos … Our first effort was clean and fairly simple: 22 Overlog rules in 53 lines of code, corresponding nearly line-for-line with the invariants from Lamport’s original paper [21].”

What Worked Well
• Concise, declarative implementation with good performance
  • 10-20x more concise than Java (LOCs)
  • Similar performance (within 10-20%)
• Separation of policy and mechanism
• Ease of evolution
  1. High availability (failover + Paxos)
  2. Scalability (hash partitioned FS master)
  3. Monitoring as an aspect

What Worked Poorly
• Dodgy semantics
• As with Prolog, depended on understanding interpreter behavior
• In particular:
  1. change (e.g., state update)
  2. uncertainty (e.g., async communication)

Classical Consistency Mechanisms: Coordination
• Consensus: Get a (fixed) set of machines to choose the same thing (Paxos)
• SMR: Get a (fixed) set of machines to choose the same sequence of things (Multipaxos)
• Commit Protocols: Get a (fixed) set of machines to agree on transaction commit (Two-Phase Commit)

Coordination Avoidance
“the first principle of successful scalability is to batter the consistency mechanisms down to a minimum… make it as hard as possible for application developers to get permission to use them”
James Hamilton (IBM, MS, Amazon), in Birman, Chockler: “Toward a Cloud Computing Research Agenda”, LADIS 2009

Our own experience…
• We didn’t need coordination for Network Routing!
• We did need it for certain DHT overlays that were “racy”
• We implemented it for BOOM-FS (Multipaxos)!
When MUST we coordinate, when can we find something more clever? What is coordination FOR?!

Today: Lessons and Foundations
Declarative Programming
• Lessons learned from early languages and prototypes
• The need for theory
Logic Foundations
• Semantics: Dedalus & Bloom
• Computability: The CALM Theorem
Algebraic Approaches

Dedalus and Bloom
• Dedalus: Datalog in time and space
• Bloom: A Ruby-based functional DSL based on Dedalus
• Extended to BloomL to incorporate lattice types (more shortly)
http://bloom-lang.net

Dedalus: It’s About Time (Datalog 2010)
Datalog constrained by:
• Logical time as the rightmost attribute of all predicates.
• Restrictions on rules to track state over time
“Time is what keeps everything from happening at once.” Ray Cummings, The Girl in the Golden Atom, 1922

Sugared Dedalus
• Deductive rule (immediate): p(A,B) :- q(A,B).
• Inductive rule (sequential): p(A,B)@next :- q(A,B).
• Async rule (deferred, communication): p(A,B)@async :- q(A,B).

Dedalus
• Deductive rule (immediate): p(A,B,S) :- q(A,B,T), T=S.
• Inductive rule (sequential): p(A,B,S) :- q(A,B,T), S=T+1.
• Async rule (deferred, communication): p(A,B,S) :- q(A,B,T), time(S), choose((A,B,T), (S)).
PODS 1990

Logic and time
Key relationships:
• Sequentiality
• Mutual exclusion
• Atomicity
Datalog: Relationships among facts
Dedalus: Also, relationships between states

Dedalus: Semantics
Approach: model theory
• Datalog famously has a unique minimal model, which coincides with the least fixed point semantics!
• In Dedalus, each model is a trace:
  1. Infinite across time (hence minimality is tricky)
  2. Non-deterministic across time, via the choice construct (hence uniqueness is tricky)

Dedalus: Semantics
• Dedalus: from stable models to Ultimate Models
• Ultimate model: those facts that are eventually always true
• Programs with a unique ultimate model are confluent.
Berkeley/Hasselt Tech Report, 2011; Datalog in Academia and Industry, LNCS 2012; TPLP 2015

CALM: Consistency As Logical Monotonicity
Conjecture (CALM): A distributed program P has a consistent, coordination-free distributed implementation if and only if it is monotonic. (Hellerstein, PODS 2010)

CALM: Consistency As Logical Monotonicity
Theorem (CALM): A distributed program P has a consistent, coordination-free distributed implementation if and only if it is monotonic. (Ameloot, Neven, Van den Bussche, PODS 2011)

We’ll need some formal definitions

Consistency: Confluent Distributed Execution
Definition: A distributed program P is consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication.

Monotonicity
Definition: A distributed program P is monotonic if, for any input sets S and T, S ⊆ T implies P(S) ⊆ P(T).
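
The monotonicity definition can be checked by brute force on a small universe. The sketch below is only an illustration under stated assumptions: the helper names (`subsets`, `is_monotonic`) and the two toy programs are made up for this example, and real distributed programs are of course not tested this way.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a small universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_monotonic(program, universe):
    """Brute-force check that S ⊆ T implies program(S) ⊆ program(T)."""
    all_sets = subsets(universe)
    return all(program(s) <= program(t) for s in all_sets for t in all_sets if s <= t)

def evens(s):
    # A selection: the output can only grow as the input grows.
    return frozenset(x for x in s if x % 2 == 0)

def only_if_no_three(s):
    # Depends on the absence of 3: more input can retract earlier output.
    return frozenset() if 3 in s else frozenset(s)

universe = {1, 2, 3, 4}
evens_ok = is_monotonic(evens, universe)
negation_ok = is_monotonic(only_if_no_three, universe)
```

Here `evens_ok` comes out true and `negation_ok` false: adding 3 to the input {1} shrinks the second program's output from {1} to the empty set.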

Coordination: Data-Independent Messaging
Definition: A distributed program P(T) uses coordination if it requires messages to be sent under all possible input partitionings (shardings) of T.

Two Canonical Examples
• Distributed Deadlock: Once you observe the existence of a waits-for cycle, you can (autonomously) declare deadlock. More information will not change the result.
• Garbage Collection: Suspecting garbage (the non-existence of a path from root) is not enough; more information may change the result. Hence you are required to check all nodes for information (under any assignment of objects to nodes!)
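
The contrast between the two examples can be sketched in Python on a single machine. The functions `has_cycle` and `garbage` and the tiny graphs below are hypothetical, chosen only to show that a detected cycle survives new edges while a garbage verdict may not.

```python
def has_cycle(edges):
    """True if the waits-for graph (a set of (waiter, holder) pairs) has a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reachable(start):
        seen, stack = set(), list(graph.get(start, ()))
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(graph.get(n, ()))
        return seen

    return any(n in reachable(n) for n in graph)

def garbage(objects, refs, root):
    """Objects with no path from root, given (from, to) reference pairs."""
    live, stack = {root}, [root]
    while stack:
        n = stack.pop()
        for a, b in refs:
            if a == n and b not in live:
                live.add(b)
                stack.append(b)
    return set(objects) - live

# Deadlock: a cycle seen in partial information stays a cycle.
partial_waits = {("t1", "t2"), ("t2", "t1")}
more_waits = partial_waits | {("t3", "t1")}

# Garbage: a verdict from partial information can be overturned.
objects = {"root", "x", "y"}
partial_refs = {("root", "x")}
more_refs = partial_refs | {("x", "y")}
```

With `partial_refs`, object `y` looks like garbage; once the reference from `x` to `y` is seen, it is live again. That retraction is the non-monotonic behavior the slide describes.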

CALM (Consistency As Logical Monotonicity)
Theorem (CALM): A distributed program P has a consistent, coordination-free distributed implementation if and only if it is monotonic.
• Hellerstein JM. The declarative imperative: Experiences and conjectures in distributed logic. ACM SIGMOD Record, Sep 2010.
• Ameloot TJ, Neven F, Van den Bussche J. Relational transducers for declarative networking. JACM, Apr 2013.
• Ameloot TJ, Ketsman B, Neven F, Zinn D. Weaker forms of monotonicity for declarative networking: a more fine-grained answer to the CALM-conjecture. ACM TODS, Feb 2016.
• Hellerstein JM, Alvaro P. Keeping CALM: When Distributed Consistency is Easy. CACM, September 2020.

Weaker forms of monotonicity (PODS ‘14)
• Policy-aware transducers
  • Suppose the transducer knows the policy for sharding
  • Some “not exists” queries become coordination-free!
• A hierarchy of definitions of “weaker monotonicity” under “stronger policy knowledge”
• Deeper connections to Datalog variants
• Descriptive Complexity characterizations
Is there more?

There’s More to this Story…
CACM Sep. 2020; SIGMOD Record Jun 2014

Today: Lessons and Foundations
Declarative Programming
• Lessons learned from early languages and prototypes
• The need for theory
Logic Foundations
• Semantics: Dedalus & Bloom
• Computability: The CALM Theorem
Algebraic Approaches
• Join semilattices and CRDTs
• Lattice composition

My Systems Friends
Forget all this logic stuff. All I want to talk about is stuff like reordering API calls.

Associative, Commutative, Idempotent, Distributed (CIDR 2009)
Photo: http://www.flickr.com/photos/25579597@N00/40955407/

Associative, Commutative, Idempotent, Distributed
• Associative: (X ◦ Y) ◦ Z = X ◦ (Y ◦ Z) (batch-insensitive)
• Commutative: X ◦ Y = Y ◦ X (order-insensitive)
• Idempotent: X ◦ X = X (resend-insensitive)
• Distributed: (acronym-insensitive)
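
The three algebraic laws can be spot-checked on sample values. This is a minimal sketch, assuming a hypothetical helper `is_aci` and hand-picked samples; passing on samples is evidence, not a proof.

```python
from itertools import product

def is_aci(merge, samples):
    """Check associativity, commutativity and idempotence of merge on the samples."""
    assoc = all(merge(merge(x, y), z) == merge(x, merge(y, z))
                for x, y, z in product(samples, repeat=3))
    comm = all(merge(x, y) == merge(y, x) for x, y in product(samples, repeat=2))
    idem = all(merge(x, x) == x for x in samples)
    return assoc and comm and idem

ints = [0, 2, 5, 7, 11]
sets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

max_is_aci = is_aci(max, ints)                    # merge = max
union_is_aci = is_aci(frozenset.union, sets)      # merge = set union
sum_is_aci = is_aci(lambda x, y: x + y, ints)     # addition is not idempotent
```

Addition fails only the idempotence check (2 + 2 is not 2), which is why a resent message would corrupt a plain running sum.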

Storing an Integer
VON NEUMANN
int ctr;
operator:= (x) {
  // assign
  ctr = x;
}
ACID 2.0
int ctr;
operator<= (x) {
  // merge
  ctr = MAX(ctr, x);
}
DISORDERLY INPUT STREAMS:
2, 5, 6, 7, 11, 22, 44, 91
5, 7, 2, 11, 44, 6, 22, 91, 5
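
To make the difference concrete, the two operators can be replayed over the two input streams from the slide. This is a small Python sketch; the starting value 0 and the function names `assign` and `merge` are assumptions made for the example.

```python
from functools import reduce

stream_a = [2, 5, 6, 7, 11, 22, 44, 91]
stream_b = [5, 7, 2, 11, 44, 6, 22, 91, 5]   # reordered, with a duplicated 5

def assign(ctr, x):
    # VON NEUMANN: overwrite, so the last message to arrive wins.
    return x

def merge(ctr, x):
    # ACID 2.0: keep the maximum, so order and duplicates do not matter.
    return max(ctr, x)

# Final counter value per stream, starting from an assumed initial value of 0.
final_assign = [reduce(assign, s, 0) for s in (stream_a, stream_b)]
final_merge = [reduce(merge, s, 0) for s in (stream_a, stream_b)]
```

Assignment ends at 91 for the first stream and 5 for the second, while the merge version ends at 91 for both.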

Storing an Integer
[Figure: two plots of the stored value for the disordered input streams, one under VON NEUMANN assignment and one under ACID 2.0 merge]

Sounds familiar!
• Observation: ACID 2.0 is monotonic! Direct result of semi-lattice properties
• Observation: ACID 2.0 is confluent! a.k.a. “eventually consistent” without coordination
Monotonicity => Consistency

CRDTs
• ACI properties: join semilattices
• Convergent Replicated Data Type (CRDT): a class, with a merge function
• Merge is Associative, Commutative and Idempotent
• Implementations of many handy data structures as CRDTs
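
As one example of "a class, with a merge function", here is a sketch of a grow-only counter, a commonly described CRDT. The class name `GCounter`, its methods and the replica names are assumptions for illustration and are not taken from the talk.

```python
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, replica, amount=1):
        # Each replica only ever bumps its own slot.
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def merge(self, other):
        # Pointwise max is associative, commutative and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})

    def value(self):
        return sum(self.counts.values())

# Two replicas count independently, then exchange state.
a, b = GCounter(), GCounter()
a.increment("a")
a.increment("a")
b.increment("b")

ab = a.merge(b)
ba = b.merge(a)
again = ab.merge(b)   # re-delivering b's state changes nothing
```

Both merge orders yield the same state with value 3, and merging the same state a second time leaves it unchanged.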

Problems: Scoping and Correctness
• Scope dilemma: Is my application a CRDT? E.g. Google Docs.
• How can I be sure?

Desire: Composition of Simple Lattices
• Write simple lattices that are easy to test
• How to compose lattices into larger programs?
SOCC 2012, http://bloom-lang.net; ICFP 2016, 2018

Monotone Functions
• Set (Merge = Union), e.g. {a}, {b}, {c}, {a,b}, {b,c}, {a,c}, {a,b,c}
• Increasing Int (Merge = Max), e.g. 3, 5, 7
• Boolean (Merge = Or), e.g. false, true
Can use monotone functions to map to other lattices! E.g. card(S) maps a Set to an Increasing Int, and x > 6 maps an Increasing Int to a Boolean.
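
A short Python sketch of the two mappings named on the slide, card(S) and x > 6. The function names, the sample history and the sample integers are assumptions chosen for illustration.

```python
def card(s):
    # Set (merge = union) -> Increasing Int (merge = max)
    return len(s)

def above_six(x):
    # Increasing Int (merge = max) -> Boolean (merge = or)
    return x > 6

# A set that only grows by union, observed at three points in time.
history = [{"a"}, {"a", "b"}, {"a", "b", "c"}]
sizes = [card(s) for s in history]

# An integer that only grows by max.
observed = [3, 5, 7, 7]
flags = [above_six(x) for x in observed]
```

As the inputs climb their lattices, the outputs never move backwards: `sizes` is non-decreasing, and `flags` flips from false to true at most once.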

Morphisms
• Monotone functions that distribute across Merge: m(X ◦ Y) = m(X) ◦ m(Y)
• Why do we care? Differentiation/Flow
• Delta-based database query processing tricks translate naturally (e.g. semi-naïve evaluation)
[Figure: the non-empty subsets of {1,2,3,4} mapped by max(S) onto 1, 2, 3, 4: a morphism!]
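
The morphism law can be checked exhaustively on the non-empty subsets of {1,2,3,4}. In this sketch the helper `is_morphism` is a hypothetical name, and cardinality is included as a contrasting monotone function that is not a morphism.

```python
from itertools import combinations

universe = [1, 2, 3, 4]
nonempty = [frozenset(c) for r in range(1, 5) for c in combinations(universe, r)]

def is_morphism(m, merge_in, merge_out, samples):
    """Check m(X ◦ Y) == m(X) ◦ m(Y) for every pair of samples."""
    return all(m(merge_in(x, y)) == merge_out(m(x), m(y))
               for x in samples for y in samples)

# max over a set, from (sets, union) to (ints, max)
max_is_morphism = is_morphism(max, frozenset.union, max, nonempty)

# cardinality is monotone but does not distribute over union
card_is_morphism = is_morphism(len, frozenset.union, max, nonempty)

# The delta trick: fold in only the new elements instead of recomputing.
old, delta = frozenset({1, 2}), frozenset({4})
incremental = max(max(old), max(delta))
recomputed = max(old | delta)
```

For cardinality the law already fails on {1} and {2}: the union has size 2 but the max of the two sizes is 1.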

Anna: Mutable State Encapsulated in Lattices
• A coordination-free KVS in C++
• Simple lattices a la BloomL for various consistency levels
• Choose any HAT, we tag your data with the right lattice
• Anna runs without any coordination, locks or atomics
  • Every core uses private memory
  • Versions are gossiped/merged across cores
• Extremely simple, crazy fast and scalable
VLDB 2013; ICDE 2019

Today: Lessons and Foundations
Declarative Programming
• Lessons learned from early languages and prototypes
• The need for theory
Logic Foundations
• Semantics: Dedalus & Bloom
• Computability: The CALM Theorem
Algebraic Approaches
• Join semilattices and CRDTs
• Lattice composition
Open Problems and PACT Programming
• CALM Advances
• Towards a Programmable Cloud

Open Problems 1: Constructive/Relaxed CALM
CALM says what’s possible. It does not help you achieve it.
• Constructive CALM
  • Input: a monotonic input/output “problem”
  • Desired output: a coordination-free implementation. Or some help designing it!
• Relaxed CALM
  • Input: a non-monotonic problem specification
  • Desired output: a “relaxed” specification that is monotonic and “similar”.

Open Problems 2: Stochastic CALM (NeurIPS 2011, NeurIPS 2015)
A supermartingale [7] is a stochastic process $W_t$ such that $\mathbb{E}[W_{t+1} \mid W_t] \le W_t$. That is, the expected value is non-increasing over time.

Open Problems 3: PTIME in the cloud?
• Semi-positive Datalog: Datalog enhanced with:
  • Negation on stored facts (only!)
  • A successor relation (some source of order)
• Immerman-Vardi Theorem: Semi-positive Datalog = PTIME
• A practical, weaker form of monotonicity corresponding to semi-positive Datalog?!
• What is our source of order? Is this simply coordination, or is there something weaker?

New Directions in Cloud Programming
Alvin Cheung, Natacha Crooks, Joe Hellerstein & Matthew Milano, UC Berkeley

A New PACT for Cloud Programming
A declaration of four independent facets:
• P: Program Semantics. Functionality modulo distribution
• A: Availability. Tolerate f independent failures
• C: Consistency. Application-specific guarantees per API
• T: Targets for Optimization. Multi-Objective Optimization

¡Viva La Evolución!

HYDRO: A PACT Programming Stack
• Polyglot programming: Sequential Code; Actors (e.g. Orleans); Functional (e.g. Spark); Logic (e.g. Bloom); Futures (e.g. Ray); New DSLs
• HYDRAULIC: Verified Lifting
• HYDROLOGIC: Declarative PACT IR (P: Program Semantics, A: Availability, C: Consistency, T: Targets for Optimization)
• HYDROLYSIS: Compiler
• HYDROFLOW Program: Data/Lattice/Eventflow IR
• HYDROFLOW Deployment: Adaptive self-deploying exe’s
• Cloud Services: FaaS, Storage, ML Frameworks, …

Many wonderful friends and colleagues

Key PhD Dissertations

More Information
Joe Hellerstein, [email protected], @joe_hellerstein
• CALM CACM: https://bit.ly/calmcacm
• Hydro Paper: http://bit.ly/hydroCIDR21
• Hydro: https://hydro-project.github.io/
• Blog Post: https://bit.ly/stateofserverlessart
• DSF@Berkeley: https://dsf.berkeley.edu

Backup Slides

State Update Via Frame Rules
p(A, B)@next :- p(A, B), ¬p_del(A, B).
Example Trace: p(1, 2)@101; p(1, 3)@102; p_del(1, 3)@300;
[Figure: timeline of p(1, 2), p(1, 3) and p_del(1, 3) over times 101, 102, ..., 300, 301]
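
The frame rule carries each p fact forward one timestep unless a matching p_del fact holds at the current timestep. A minimal Python sketch of that reading, replaying the example trace; the function `run_trace` and its dictionary-based input encoding are assumptions made for this illustration and are not Dedalus itself.

```python
def run_trace(inserts, deletes, start, end):
    """Return {t: set of p facts that hold at time t} for start <= t <= end."""
    state = {}
    carried = set()   # facts derived for the current step by the frame rule
    for t in range(start, end + 1):
        current = carried | inserts.get(t, set())
        state[t] = current
        # p(A,B)@next :- p(A,B), ¬p_del(A,B).
        carried = current - deletes.get(t, set())
    return state

inserts = {101: {(1, 2)}, 102: {(1, 3)}}   # p(1, 2)@101; p(1, 3)@102
deletes = {300: {(1, 3)}}                  # p_del(1, 3)@300
state = run_trace(inserts, deletes, 101, 301)
```

Under this reading p(1, 3) still holds at time 300, when the deletion arrives, and is absent from 301 onward, while p(1, 2) persists throughout.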

The CALM Theorem Went Further!
An oblivious transducer model does not access the following information:
• All (“membership”: a table of all nodes in the system)
• Id (“self-awareness”: the name of the local node)
For queries Q in language L, the following are all equivalent!
• Q can be computed by a coordination-free L-transducer
• Q is monotonic
• Q can be computed by an oblivious L-transducer
Practical implications of (non-)obliviousness for highly dynamic systems! E.g. serverless computing, edge/fog computing, p2p

Related Work
• Koutris & Suciu. “Parallel Evaluation of Conjunctive Queries.” PODS 2011
• Beame, Koutris & Suciu. “Communication Steps for Parallel Query Processing.” PODS 2013
• Beame, Koutris & Suciu. “Skew in Parallel Query Processing.” PODS 2014
• Koutris, Beame & Suciu. “Worst-Case Optimal Algorithms for Parallel Query Processing.” ICDT 2016

Towards a Solution
• Systems work is all about I/O & memory models.
• What if we reason about application semantics?
With thanks to Peter Bailis…

Logic and Lattice Composition
• Bloom: a “disorderly” programming language based on logic and lattices
• Monotone composition of “lego blocks” lattices into bigger programs.
• Allows non-monotonic expressions but they require “extra” syntax.
• Syntactic CALM analysis
Alvaro P, Conway N, Hellerstein JM, Marczak WR. Consistency Analysis in Bloom: a CALM and Collected Approach. CIDR 2011.
Conway N, Marczak WR, Alvaro P, Hellerstein JM, Maier D. Logic and lattices for distributed programming. ACM SoCC, 2012.

CloudBurst/Hydrocache: A Stateful Serverless Platform
[Figure: Competitive performance for a prediction serving pipeline. Latency (ms) for Python, Droplet, AWS Lambda (Mock), AWS SageMaker and AWS Lambda (Actual).]
[Figure: Performant consistency on a real-world web app. Read and write latency (ms) for Droplet (LWW), Droplet (Causal) and Redis.]
[Figure: End-to-end latency for a task with large data inputs. Latency (ms) for Droplet (Hot), Droplet (Cold), Lambda (Redis) and Lambda (S3) at sizes 80KB, 800KB, 8MB and 80MB.]

Intuition: Streaming Queries are Monotonic
• Selection (𝜎 p): SELECT * FROM input WHERE …
• Projection (𝜋 f…): SELECT f(…), … FROM input
• Join (⋈ p): SELECT * FROM i1 INNER JOIN i2 ON …

Intuition: Blocking Queries are Non-Monotonic
• 𝛤 k,g: SELECT key, COUNT(*) FROM input GROUP BY key
• 𝛤 k,b(g): SELECT key FROM input GROUP BY key HAVING COUNT(*) > 10
Except in special cases! We’ll come back to this…
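
The streaming versus blocking distinction can be illustrated by comparing each query's output on a partial input and on a larger input. This is a hedged Python sketch: the function names, the (key, id) row encoding and the threshold of 2 (instead of the slide's 10) are assumptions made to keep the example small. The threshold query is shown as one plausible reading of the "special cases".

```python
from collections import Counter

def select(rows, pred):
    # Streaming: each row is accepted or rejected on its own.
    return {r for r in rows if pred(r)}

def group_count(rows):
    # Blocking: a (key, count) pair is only final once all input has been seen.
    return set(Counter(key for key, _ in rows).items())

def keys_over(rows, threshold):
    # A threshold on a count that can only grow.
    counts = Counter(key for key, _ in rows)
    return {k for k, n in counts.items() if n > threshold}

early = {("x", i) for i in range(3)}      # partial input: three rows for key x
late = early | {("x", 3), ("y", 0)}       # more input arrives later

is_x = lambda row: row[0] == "x"
select_grows = select(early, is_x) <= select(late, is_x)
count_grows = group_count(early) <= group_count(late)
threshold_grows = keys_over(early, 2) <= keys_over(late, 2)
```

The selection and the threshold query only add to their earlier answers, while the grouped count has to retract ("x", 3) in favor of ("x", 4).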

Anna Serverless KVS
• Anyscale: perform like Redis, scale like S3
• CALM consistency levels via simple lattices
• Autoscaling & multitier serverless storage
• Won best-of-conference at ICDE [1], VLDB [2]
[1] Wu, Chenggang, et al. "Anna: A kvs for any scale." IEEE Transactions on Knowledge and Data Engineering (2019).
[2] Wu, Chenggang, Vikram Sreekanti, and Joseph M. Hellerstein. "Autoscaling tiered cloud storage in Anna." PVLDB 12.6 (2019): 624-638.

Anna Performance
• Shared-nothing at all scales (even across threads)
• Crazy fast under contention
  • Up to 700x faster than Masstree within a multicore machine
  • Up to 10x faster than Cassandra in a geo-distributed deployment
• Coordination-free consistency. No atomics, no locks, no waiting ever!

CALM Consistency
Simple, clean lattice composition gives range of consistency levels
[Figure: Lines of C++ code modified by system component]
KEEP CALM AND WRITE(X)

Autoscaling & Multi-Tier Cost Tradeoffs
350x the performance of DynamoDB for the same price!

Cloudburst: A Stateful Serverless Platform
• Main Challenge: Cache consistency!
• Hydrocache: new consistency protocols for distributed client “sessions”
[Figure: Compute and Storage tiers]

Multiple Consistency Levels Here Too
• Read Atomic transactions
  • AFT [1]: a fault tolerance shim layer between any FaaS and any object store
  • Currently evaluated between AWS Lambda and AWS S3!
• Multisite Transactional Causal Consistency (MTCC) [2]
  • Causal: Preserve Lamport’s happened before relation
  • Multisite transactional: Nested functions running across multiple machines.
[1] Sreekanti, Vikram, et al. A Fault-Tolerance Shim for Serverless Computing. To appear, Eurosys (2020).
[2] Wu, Chenggang, et al. Transactional Causal Consistency for Serverless Computing. To appear, ACM SIGMOD (2020).

Running a Twitter Clone on Cloudburst
[Figure: Read and write latency (ms) for Cloudburst (LWW), Cloudburst (Causal) and Redis.]

Prediction Serving on Cloudburst
[Figure: Latency (ms) for Python, Cloudburst, AWS SageMaker and AWS Lambda.]
