Toward a Programmable Cloud: Foundations and Challenges

Keynote talk at ACM POPL 2021.

Joe Hellerstein

April 07, 2022

Transcript

  1. Cray-1, 1976 Supercomputers iPhone, 2007 Smart Phones Macintosh, 1984 Personal

    Computers PDP-11, 1970 Minicomputers Sea Changes in Computing AWS, 2006 Public Cloud
  2. Cray-1, 1976 Supercomputers iPhone, 2007 Smart Phones Macintosh, 1984 Personal

    Computers PDP-11, 1970 Minicomputers New Platform + New Language = Innovation AWS, 2006 Public Cloud ?
  3. The Big Query ? AWS, 2006 Public Cloud How will

    people program the cloud? A grand challenge across core computer science.
  4. 5 Disorderly! Yet well-defined semantics. Parallelism and scale since the

    1980s! Declarative The cloud is crying out for this! Heterogeneous, evolving, autoscaling Beautiful Growing body of work on distributed Logic Programming Asking and answering fundamental new questions! A Database/Logic Approach PODS POPL PODC
  5. 6 Prior Generation: Language Design, Top-down SIGCOMM ‘05 SIGMOD ‘06

    SOSP ‘05 SenSys ‘07 EuroSys ‘10 PODS ‘10 Datalog 2.0 ‘10 SIGOPS 2010 CIDR ‘11 SOCC ‘12
  6. 7 Latest Gen: Serverless Systems, Bottom-up Cloudburst: Stateful FaaS VLDB20,

    SIGMOD20, Eurosys20 Anna: Autoscaling multi-tier KVS ICDE18, VLDB19 f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) Proving out the foundations in a traditional systems setting.
  7. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches Today: Lessons and Foundations
  8. Protocol Synthesis as Query Optimization 11 “Finally, writing the queries

    in NDlog illustrates surprising relationships between protocols. For example, we have shown that distance vector and dynamic source routing protocols differ only in a simple, traditional query optimization decision: the order in which a query’s predicates are evaluated.” CACM 11/2009
  9. A Hadoop Backend in Logic BOOM Analytics BOOM-FS: Rewrite of

    Hadoop’s HDFS in Overlog Replacement of Hadoop Scheduler in Overlog 13 EuroSys ‘10 “We began by creating an Overlog implementation of basic Paxos … Our first effort was clean and fairly simple: 22 Overlog rules in 53 lines of code, corresponding nearly line-for-line with the invariants from Lamport’s original paper [21].”
  10. What Worked Well Concise, declarative implementation with good performance 10-20x

    more concise than Java (LOCs) Similar performance (within 10-20%) Separation of policy and mechanism Ease of evolution 1. High availability (failover + Paxos) 2. Scalability (hash partitioned FS master) 3. Monitoring as an aspect
  11. What Worked Poorly Dodgy semantics As with Prolog, depended on

    understanding interpreter behavior In particular, 1. change (e.g., state update) 2. uncertainty (e.g., async communication)
  12. 18 Classical Consistency Mechanisms: Coordination Consensus: Simple: Get a (fixed)

    set of machines to choose the same thing (Paxos) SMR: Get a (fixed) set of machines to choose the same sequence of things (Multipaxos) Commit Protocols: Get a (fixed) set of machines to agree on transaction commit (Two-Phase Commit)
  13. 19 Coordination Avoidance “The first principle of successful scalability is

    to batter the consistency mechanisms down to a minimum… make it as hard as possible for application developers to get permission to use them.” James Hamilton (IBM, MS, Amazon), in Birman, Chockler: “Toward a Cloud Computing Research Agenda”, LADIS 2009
  14. 20 Our own experience… We didn’t need coordination for Network

    Routing! We did need it for certain DHT overlays that were “racy” We implemented it for BOOM-FS (Multipaxos)! When MUST we coordinate, when can we find something more clever? What is coordination FOR?!
  15. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Semantics: Dedalus & Bloom • Computability: The CALM Theorem Today: Lessons and Foundations
  16. 22 Dedalus and Bloom Dedalus Datalog in time and space

    Bloom A Ruby-based functional DSL based on Dedalus Extended to BloomL to incorporate lattice types (more shortly) http://bloom-lang.net
  17. 23 Dedalus: It’s About Time Datalog constrained by: Logical time

    as the rightmost attribute of all predicates. Restrictions on rules to track state over time “Time is what keeps everything from happening at once.” Ray Cummings, The Girl in the Golden Atom, 1922 Datalog 2010
  18. 24 Sugared Dedalus Deductive rule (immediate) Inductive rule (sequential) Async

    rule (deferred, communication) p(A,B) :- q(A,B). p(A,B)@next :- q(A,B). p(A,B)@async :- q(A,B).
  19. 25 Dedalus Deductive rule (immediate) Inductive rule (sequential) Async rule

    (deferred, communication) p(A,B,S) :- q(A,B,T), T=S. p(A,B,S) :- q(A,B,T), S=T+1. p(A,B,S) :- q(A,B,T), time(S), choose((A,B,T), (S)). PODS 1990
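
A minimal Python sketch of one way to read the three desugared rule forms above, assuming each fact is a tuple whose last field is its logical timestamp. The function names are hypothetical, and a seeded `random.Random` stands in for the `choose` construct; real Dedalus only promises that some timestep is picked, not how.

```python
import random

def deductive(q_facts):
    # p(A,B,S) :- q(A,B,T), T=S.  The head holds in the same timestep.
    return {(a, b, t) for (a, b, t) in q_facts}

def inductive(q_facts):
    # p(A,B,S) :- q(A,B,T), S=T+1.  The head holds in the next timestep.
    return {(a, b, t + 1) for (a, b, t) in q_facts}

def asynchronous(q_facts, times, rng):
    # p(A,B,S) :- q(A,B,T), time(S), choose((A,B,T), (S)).
    # One timestep S is chosen per (A,B,T); which one is arbitrary.
    return {(a, b, rng.choice(times)) for (a, b, t) in sorted(q_facts)}

q = {(1, 2, 10), (1, 3, 11)}
p_now = deductive(q)      # same timestamps as q
p_next = inductive(q)     # timestamps shifted by one
p_async = asynchronous(q, times=range(100), rng=random.Random(0))
```

Only the asynchronous rule introduces non-determinism: rerunning with a different seed changes the timestamps in `p_async` but never the `(A, B)` pairs.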
  20. Logic and time Key relationships: Sequentiality Mutual exclusion Atomicity Datalog:

    Relationships among facts Dedalus: Also, relationships between states
  21. Dedalus: Semantics Approach: model theory Datalog famously has a unique

    minimal model Which coincides with the least fixed point semantics! In Dedalus, each model is a trace 1. Infinite across time (hence minimality is tricky) 2. Non-deterministic across time, via the choice construct (hence uniqueness is tricky)
  22. Dedalus: Semantics Dedalus: from stable models to Ultimate Models Ultimate

    model: those facts that are eventually always true Programs with a unique ultimate model are confluent. Berkeley/Hasselt Tech Report, 2011 Datalog in Academia and Industry, LNCS 2012 TPLP 2015
  23. 30 Conjecture (CALM): A distributed program P has a consistent,

    coordination-free distributed implementation if and only if it is monotonic. Hellerstein, PODS 2010 CALM: CONSISTENCY AS LOGICAL MONOTONICITY
  24. 31 Theorem (CALM): A distributed program P has a consistent,

    coordination-free distributed implementation if and only if it is monotonic. Ameloot, Neven, Van den Bussche, PODS 2011 CALM: CONSISTENCY AS LOGICAL MONOTONICITY
  25. 33 Consistency: Confluent Distributed Execution Definition: A distributed program P

    is consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication.
  26. 34 Consistency: Confluent Distributed Execution Definition: A distributed program P

    is consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication.
  27. 35 Monotonicity Definition: A distributed program P is consistent if

    it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T).
  28. 36 Monotonicity Definition: A distributed program P is consistent if

    it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T).
  29. 37 Coordination: Data-Independent Messaging Definition: A distributed program P is

    consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T). Definition: A distributed program P(T) uses coordination if it requires messages to be sent under all possible input partitionings (shardings) of T.
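
The monotonicity definition can be checked by brute force on a tiny input universe. The sketch below is illustrative only: `transitive_closure`, `sinks`, and `is_monotonic_on` are hypothetical helper names, and passing the check on one small universe is evidence, not a proof.

```python
from itertools import combinations

def transitive_closure(edges):
    # Positive (negation-free) query: keep joining until nothing new appears.
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

def sinks(edges):
    # Nodes with NO outgoing edge: this query needs negation.
    nodes = {n for edge in edges for n in edge}
    sources = {a for (a, _) in edges}
    return nodes - sources

def is_monotonic_on(program, universe):
    # Check S ⊆ T implies P(S) ⊆ P(T) for every pair of subsets of universe.
    subsets = [frozenset(c)
               for r in range(len(universe) + 1)
               for c in combinations(universe, r)]
    return all(program(s) <= program(t)
               for s in subsets for t in subsets if s <= t)

universe = [(1, 2), (2, 3), (3, 1)]
closure_is_monotonic = is_monotonic_on(transitive_closure, universe)
sinks_is_monotonic = is_monotonic_on(sinks, universe)
```

With input `{(1, 2)}` node 2 is a sink, but adding `(2, 3)` retracts that answer, which is why `sinks` fails the check while `transitive_closure` passes.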
  30. 38 Distributed Deadlock: Once you observe the existence of a

    waits-for cycle, you can (autonomously) declare deadlock. More information will not change the result. Garbage Collection: Suspecting garbage (the non-existence of a path from root) is not enough; more information may change the result. Hence you are required to check all nodes for information (under any assignment of objects to nodes!) Two Canonical Examples Deadlock! Garbage?
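
The two canonical examples can be sketched in Python as follows. The graph encodings and function names are assumptions made for illustration: a partial view of the waits-for graph is enough to declare deadlock, while a partial view of the reference graph can wrongly label an object as garbage.

```python
def has_waits_for_cycle(edges):
    # Monotonic: once a cycle is seen, more edges cannot remove it.
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reaches(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if reaches(nxt, target, seen):
                    return True
        return False

    return any(reaches(n, n, set()) for n in graph)

def garbage(objects, refs, root):
    # Non-monotonic: "no path from root" can be retracted by a new reference.
    reachable = {root}
    frontier = [root]
    while frontier:
        cur = frontier.pop()
        for a, b in refs:
            if a == cur and b not in reachable:
                reachable.add(b)
                frontier.append(b)
    return set(objects) - reachable

partial_waits = {("t1", "t2"), ("t2", "t1")}
full_waits = partial_waits | {("t3", "t1")}
objects = {"root", "a", "b"}
partial_refs = {("root", "a")}
full_refs = partial_refs | {("a", "b")}
```

Here `has_waits_for_cycle` gives the same answer on the partial and full views, whereas `garbage` reports `b` on the partial view and nothing on the full one.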
  31. 39 CALM (Consistency As Logical Monotonicity) Theorem (CALM): A distributed

    program P has a consistent, coordination-free distributed implementation if and only if it is monotonic. Hellerstein JM. The declarative imperative: Experiences and conjectures in distributed logic. ACM SIGMOD Record, Sep 2010. Ameloot TJ, Neven F, Van den Bussche J. Relational transducers for declarative networking. JACM, Apr 2013. Ameloot TJ, Ketsman B, Neven F, Zinn D. Weaker forms of monotonicity for declarative networking: a more fine-grained answer to the CALM-conjecture. ACM TODS, Feb 2016. Hellerstein JM, Alvaro P. Keeping CALM: When Distributed Consistency is Easy. CACM, September 2020.
  32. 40 Weaker forms of monotonicity Policy-aware transducers Suppose the transducer

    knows the policy for sharding Some “not exists” queries become coordination-free! A hierarchy of definitions of “weaker monotonicity” under “stronger policy knowledge” Deeper connections to Datalog variants Descriptive Complexity characterizations Is there more? PODS ‘14
  33. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Join semilattices and CRDTs • Lattice composition Today: Lessons and Foundations • Semantics: Dedalus & Bloom • Computability: The CALM Theorem
  34. 43 My Systems Friends Forget all this logic stuff. All

    I want to talk about is stuff like reordering API calls.
  35. Associative – (X ◦ Y) ◦ Z = X ◦

    (Y ◦ Z) – batch-insensitive Commutative – X ◦Y = Y ◦ X – order-insensitive Idempotent – X ◦ X = X – resend-insensitive Distributed – acronym-insensitive Associative, Commutative Idempotent, Distributed
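
The three algebraic properties can be tested mechanically on sample values. This is a small sketch with hypothetical helper names; checking a handful of integers illustrates the properties but does not prove them for all inputs.

```python
from itertools import product
from operator import add

def is_associative(op, values):
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(values, repeat=3))

def is_commutative(op, values):
    return all(op(x, y) == op(y, x)
               for x, y in product(values, repeat=2))

def is_idempotent(op, values):
    return all(op(x, x) == x for x in values)

def aci(op, values):
    # Returns (associative, commutative, idempotent).
    return (is_associative(op, values),
            is_commutative(op, values),
            is_idempotent(op, values))

def overwrite(old, new):
    # Plain assignment: the newest value wins.
    return new

samples = [0, 1, 5, 7]
results = {
    "max": aci(max, samples),
    "add": aci(add, samples),
    "overwrite": aci(overwrite, samples),
}
```

On these samples `max` satisfies all three properties, addition fails idempotence (resending a message double-counts), and overwrite fails commutativity (the result depends on arrival order).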
  36. Storing an Integer VON NEUMANN int ctr; operator:= (x) {

    // assign ctr = x; } ACID 2.0 int ctr; operator<= (x) { // merge ctr = MAX(ctr, x); } DISORDERLY INPUT STREAMS: 2, 5, 6, 7, 11, 22, 44, 91 5, 7, 2, 11, 44, 6, 22, 91, 5
  37. Storing an Integer VON NEUMANN ACID 2.0 DISORDERED INPUT STREAMS:

    2, 5, 6, 7, 11, 22, 44, 91 5, 7, 2, 11, 44, 6, 22, 91, 5 [Figure: two plots, each with a y-axis from 0 to 100 and an x-axis from 1 to 9.]
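
The contrast between assignment and merge can be replayed on the two input streams from the slide. This is a sketch, assuming an initial counter value of 0 (the slide does not give one); `assign`, `merge`, and `run` are illustrative names.

```python
from functools import reduce

stream_a = [2, 5, 6, 7, 11, 22, 44, 91]
stream_b = [5, 7, 2, 11, 44, 6, 22, 91, 5]  # reordered, with a duplicate 5

def assign(ctr, x):
    # Von Neumann style: overwrite with the latest value.
    return x

def merge(ctr, x):
    # ACID 2.0 style: keep the maximum seen so far.
    return max(ctr, x)

def run(update, stream, initial=0):
    # Fold the update over the stream, starting from the assumed initial value.
    return reduce(update, stream, initial)

assigned = (run(assign, stream_a), run(assign, stream_b))
merged = (run(merge, stream_a), run(merge, stream_b))
```

Assignment ends at 91 on the first stream and 5 on the second, so replicas that see different orders disagree. The max merge ends at 91 on both.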
  38. 48 Sounds familiar! Observation: ACID 2.0 is monotonic! Direct result

    of semi-lattice properties Observation: ACID 2.0 is confluent! a.k.a “eventually consistent” without coordination Monotonicity => Consistency
  39. 49 CRDTs ACI properties: join semilattices Convergent Replicated Data Type

    (CRDT) A class, with a merge function Merge is Associative, Commutative and Idempotent Implementations of many handy data structures as CRDTs
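
As one concrete instance of "a class, with a merge function", here is a sketch of a grow-only counter, a commonly cited CRDT. It is not taken from the talk; the class name, the per-replica dictionary layout, and the replica ids are assumptions for illustration.

```python
class GCounter:
    """Grow-only counter: one non-decreasing count per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, replica, n=1):
        # Each replica only ever bumps its own slot.
        self.counts[replica] = self.counts.get(replica, 0) + n

    def merge(self, other):
        # Pointwise max: associative, commutative and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})

    def value(self):
        return sum(self.counts.values())

a = GCounter()
a.increment("replica_a", 2)
b = GCounter()
b.increment("replica_b", 3)

ab = a.merge(b)
ba = b.merge(a)
```

Because merge is a pointwise max, `ab` and `ba` hold the same state, and merging a replica with itself changes nothing.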
  40. 50 Problems: Scoping and Correctness Scope dilemma: Is my application

    a CRDT? E.g. Google Docs. How can I be sure?
  41. 51 Desire: Composition of Simple Lattices Write simple lattices that

    are easy to test How to compose lattices into larger programs? SOCC 2012 http://bloom-lang.net ICFP 2016, 2018
  42. Monotone Functions Set (Merge = Union) Increasing Int (Merge =

    Max) Boolean (Merge = Or) {a} {b} {c} {a,b} {b,c} {a,c} {a,b,c} 5 5 7 7 3 7 false false false true true true Can use monotone functions to map to other lattices! card(S) x > 6
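
A short sketch of chaining monotone functions across the three lattices named above: a set that grows by union, its cardinality as an increasing integer, and a threshold test as a boolean that can only flip from false to true. The threshold of 2 (rather than the slide's `x > 6`) and the helper names are choices made to keep the example small.

```python
def card(s):
    # Monotone map: set lattice (union) -> increasing-int lattice (max).
    return len(s)

def above(threshold):
    # Monotone map: increasing-int lattice -> boolean lattice (or).
    return lambda x: x > threshold

def is_non_decreasing(seq):
    # In Python False <= True, so this also works for booleans.
    return all(x <= y for x, y in zip(seq, seq[1:]))

states = []
current = set()
for delta in [{"a"}, {"b"}, {"a"}, {"c"}]:   # a duplicate delta is harmless
    current = current | delta
    states.append(set(current))

sizes = [card(s) for s in states]
flags = [above(2)(n) for n in sizes]
```

As the set grows, `sizes` never decreases and `flags` never goes from true back to false, so a downstream consumer can act on a true flag without waiting for more input.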
  43. 53 Morphisms Monotone functions that distribute across Merge m(X ◦

    Y) = m(X) ◦ m(Y) Why do we care? Differentiation/Flow Delta-based database query processing tricks translate naturally (e.g. semi-naïve evaluation) {1,2} {1,3} {1,4} {2,3} {2,4} {3,4} {1,2,3} {1,2,4} {1,3,4} {2,3,4} {1,2,3,4} {1} {2} {3} {4} max(S) 2 3 4 3 4 4 1 2 3 4 a morphism!
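
The morphism condition m(X ◦ Y) = m(X) ◦ m(Y) can be tested directly. In this sketch (helper names are hypothetical) `max` over sets is a morphism from union to integer max, while cardinality is monotone but not a morphism. The last function shows why that matters: a morphism can be maintained from deltas alone.

```python
def is_morphism(m, merge_in, merge_out, samples):
    # Check m(x ◦ y) == m(x) ◦ m(y) on every pair of sample inputs.
    return all(m(merge_in(x, y)) == merge_out(m(x), m(y))
               for x in samples for y in samples)

def union(x, y):
    return x | y

samples = [frozenset(s) for s in ({1}, {2}, {1, 2}, {2, 3}, {1, 2, 3, 4})]

max_is_morphism = is_morphism(max, union, max, samples)
card_is_morphism = is_morphism(len, union, max, samples)

def incremental_max(deltas):
    # Apply the morphism to each delta and merge the outputs,
    # never revisiting earlier input.
    running = None
    for delta in deltas:
        m = max(delta)
        running = m if running is None else max(running, m)
    return running

deltas = [{1, 2}, {4}, {3}]
from_deltas = incremental_max(deltas)
from_scratch = max(set().union(*deltas))
```

Cardinality fails because `len({1, 2} | {2, 3})` is 3 while `max(2, 2)` is 2, so it cannot be computed from per-delta results alone.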
  44. 54 Anna: Mutable State Encapsulated in Lattices A coordination-free KVS

    in C++ Simple lattices a la BloomL for various consistency levels Choose any HAT, we tag your data with the right lattice Anna runs without any coordination, locks or atomics Every core uses private memory Versions are gossiped/merged across cores Extremely simple, crazy fast and scalable VLDB 2013 ICDE 2019
  45. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Join semilattices and CRDTs • Lattice composition • CALM Advances • Towards a Programmable Cloud Today: Lessons and Foundations • Semantics: Dedalus & Bloom • Computability: The CALM Theorem
  46. 56 Open Problems 1: Constructive/Relaxed CALM CALM says what’s possible.

    It does not help you achieve it. Constructive CALM Input: a monotonic input/output “problem” Desired output: a coordination-free implementation. Or some help designing it! Relaxed CALM Input: a non-monotonic problem specification Desired output: a “relaxed” specification that is monotonic and “similar”.
  47. 57 Open Problems 2: Stochastic CALM NeurIPS 2011 NeurIPS 2015

    A supermartingale [7] is a stochastic process $W_t$ such that $E[W_{t+1} \mid W_t] \le W_t$. That is, the expected value is non-increasing over time.
  48. 58 Open Problems 3: PTIME in the cloud? Semi-positive Datalog:

    Datalog enhanced with: Negation on stored facts (only!) A successor relation (some source of order) Immerman-Vardi Theorem: Semipositive Datalog = PTIME A practical, weaker form of monotonicity corresponding to semi-positive Datalog?! What is our source of order? Is this simply coordination, or is there something weaker?
  49. A New PACT for Cloud Programming P Program Semantics A

    Availability C Consistency T Targets for Optimization A declaration of four independent facets Functionality modulo distribution Tolerate f independent failures Application-specific guarantees per API Multi-Objective Optimization
  50. HYDRO: A PACT Programming Stack … Cloud Services … FaaS

    Storage ML Frameworks Actors (e.g. Orleans) Functional (e.g. Spark) Logic (e.g. Bloom) Futures (e.g. Ray) P Program Semantics A Availability New DSLs HYDROLOGIC HYDRAULIC Verified Lifting HYDROLYSIS Compiler HYDROFLOW Deployment HYDROFLOW Program C Consistency Targets for Optimization T Sequential Code HYDRO Polyglot programming Declarative PACT IR Data/Lattice/Eventflow IR Adaptive self-deploying exe’s
  51. Joe Hellerstein [email protected] @joe_hellerstein 6 More Information CALM CACM: https://bit.ly/calmcacm

    Hydro Paper: http://bit.ly/hydroCIDR21 Hydro: https://hydro-project.github.io/ Blog Post: https://bit.ly/stateofserverlessart DSF@Berkeley: https://dsf.berkeley.edu
  52. State Update Via Frame Rules p(A, B)@next :- p(A, B),

    ¬p_del(A, B). Example Trace: p(1, 2)@101; p(1, 3)@102; p_del(1, 3)@300; Time p(1, 2) p(1, 3) p_del(1, 3) 101 102 ... 300 301
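
The example trace can be replayed with a small simulation of the frame rule `p(A, B)@next :- p(A, B), ¬p_del(A, B)`. This is a sketch: the function name and the dictionaries mapping timesteps to inserted or deleted tuples are assumed encodings, not Dedalus syntax.

```python
def run_frame_rule(inserts, deletes, start, end):
    # Returns {timestep: set of p facts true at that timestep}.
    state = {}
    carried = set()
    for t in range(start, end + 1):
        # Facts true now: those carried forward plus any inserted at t.
        current = carried | inserts.get(t, set())
        state[t] = current
        # Frame rule: carry p to t+1 unless p_del holds for it at t.
        carried = current - deletes.get(t, set())
    return state

trace = run_frame_rule(
    inserts={101: {(1, 2)}, 102: {(1, 3)}},
    deletes={300: {(1, 3)}},
    start=101,
    end=301,
)
```

In this replay `p(1, 3)` is still true at timestep 300, where the deletion is observed, and is absent from 301 onward, while `p(1, 2)` persists throughout.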
  53. 68 The CALM Theorem Went Further! An oblivious transducer model

    does not access the following information: All (“membership”: a table of all nodes in the system) Id (“self-awareness”: the name of the local node) For queries Q in language L, the following are all equivalent! Q can be computed by a coordination-free L-transducer Q is monotonic Q can be computed by an oblivious L-transducer Practical implications of (non-)obliviousness for highly dynamic systems! E.g. serverless computing, edge/fog computing, p2p
  54. 69 Related Work Koutris & Suciu. “Parallel Evaluation of Conjunctive

    Queries.” PODS 2011 Beame, Koutris & Suciu. “Communication Steps for Parallel Query Processing.” PODS 2013 Beame, Koutris & Suciu. “Skew in Parallel Query Processing.” PODS 2014 Koutris, Beame & Suciu. “Worst-Case Optimal Algorithms for Parallel Query Processing.” ICDT 2016
  55. 70 Towards a Solution Systems work is all about I/O

    & memory models. What if we reason about application semantics? With thanks to Peter Bailis…
  56. 71 Logic and Lattice Composition Bloom: a “disorderly” programming language

    based on logic and lattices Monotone composition of “lego blocks” lattices into bigger programs. Allows non-monotonic expressions but they require “extra” syntax. Syntactic CALM analysis Alvaro P, Conway N, Hellerstein JM, Marczak WR. Consistency Analysis in Bloom: a CALM and Collected Approach. CIDR 2011. Conway N, Marczak WR, Alvaro P, Hellerstein JM, Maier D. Logic and lattices for distributed programming. ACM SoCC, 2012.
  57. CloudBurst/Hydrocache: A Stateful Serverless Platform Competitive performance for a prediction

    serving pipeline. [Figure: latency (ms) for Python, Droplet, AWS Lambda (Mock), AWS SageMaker, and AWS Lambda (Actual); plotted values: 182.5, 210.2, 325.7, 355.8, 1181, 191.5, 277.4, 411.3, 416.6, 1364.] Performant consistency on a real-world web app. [Figure: read and write latency (ms) for Droplet (LWW), Droplet (Causal), and Redis; plotted values: 16.1, 18.0, 15.0, 397, 501, 810, 31.9, 79, 27.9, 503, 801, 921.3.] [Figure: latency (ms) for Droplet (Hot), Droplet (Cold), Lambda (Redis), and Lambda (S3) at sizes 80KB, 800KB, 8MB, and 80MB; plotted values: 2.8, 5.6, 32.7, 346, 4.7, 21.1, 100, 1065, 3.2, 9.3, 38.3, 385, 6.7, 66.9, 112, 1630, 6.4, 59.8, 253, 506, 17.2, 279, 392, 2034, 81.6, 732, 2646, 1963, 238, 2743, 5209, 4250.] End-to-end latency for a task with large data inputs.
  58. Intuition: Streaming Queries are Monotonic 𝜎 p 𝜋 f… ⋈

    p SELECT * FROM input WHERE … SELECT f(…), … FROM input SELECT * FROM i1 INNER JOIN i2 ON …
  59. Intuition: Blocking Queries are Non-Monotonic 𝛤 k,g 𝛤 k,b(g) SELECT

    key, COUNT(*) FROM input GROUP BY key SELECT key FROM input HAVING COUNT(*) > 10
  60. Intuition: Blocking Queries are Non-Monotonic 𝛤 k,g 𝛤 k,b(g) SELECT

    key, COUNT(*) FROM input GROUP BY key SELECT key FROM input HAVING COUNT(*) > 10 Except in special cases! We’ll come back to this…
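
The streaming versus blocking intuition can be checked on a growing input. This sketch models rows as `(key, id)` tuples and uses hypothetical function names. Treating the threshold query as one of the "special cases" is an interpretation, not something the slide states.

```python
from collections import Counter

def select_where(rows, pred):
    # Streaming: each output row depends on a single input row.
    return {r for r in rows if pred(r)}

def count_by_key(rows):
    # Blocking: an emitted (key, count) may be retracted by later input.
    return set(Counter(k for k, _ in rows).items())

def keys_over_threshold(rows, n):
    # Threshold on a growing count: once a key qualifies, it stays.
    counts = Counter(k for k, _ in rows)
    return {k for k, c in counts.items() if c > n}

small = {("a", 1)}
big = {("a", 1), ("a", 2), ("b", 3)}   # small is a subset of big

def is_a(row):
    return row[0] == "a"

filter_grows = select_where(small, is_a) <= select_where(big, is_a)
count_grows = count_by_key(small) <= count_by_key(big)
threshold_grows = keys_over_threshold(small, 1) <= keys_over_threshold(big, 1)
```

The filter's output on the smaller input survives in the larger one, whereas the grouped count emits `("a", 1)` early and must replace it with `("a", 2)` later. The threshold query only ever adds keys, so it behaves monotonically despite using an aggregate.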
  61. 76 Anna Serverless KVS • Anyscale: perform like Redis, scale

    like S3 • CALM consistency levels via simple lattices • Autoscaling & multitier serverless storage • Won best-of-conference at ICDE, VLDB [1, 2] [1] Wu, Chenggang, et al. "Anna: A kvs for any scale." IEEE Transactions on Knowledge and Data Engineering (2019). [2] Wu, Chenggang, Vikram Sreekanti, and Joseph M. Hellerstein. "Autoscaling tiered cloud storage in Anna." PVLDB 12.6 (2019): 624-638.
  62. 77 Anna Performance Shared-nothing at all scales (even across threads)

    Crazy fast under contention Up to 700x faster than Masstree within a multicore machine Up to 10x faster than Cassandra in a geo-distributed deployment Coordination-free consistency. No atomics, no locks, no waiting ever! 700x!
  63. 78 CALM Consistency Simple, clean lattice composition gives range of

    consistency levels Lines of C++ code modified by system component KEEP CALM AND WRITE(X)
  64. Cloudburst: A Stateful Serverless Platform Main Challenge: Cache consistency! Hydrocache:

    new consistency protocols for distributed client “sessions” Compute Storage
  65. 81 Multiple Consistency Levels Here Too Read Atomic transactions AFT [1]:

    a fault tolerance shim layer between any FaaS and any object store • Currently evaluated between AWS Lambda and AWS S3! Multisite Transactional Causal Consistency (MTCC) [2] Causal: Preserve Lamport’s happened before relation Multisite transactional: Nested functions running across multiple machines. [1] Sreekanti, Vikram, et al. A Fault-Tolerance Shim for Serverless Computing. To appear, Eurosys (2020). [2] Wu, Chenggang, et al. Transactional Causal Consistency for Serverless Computing. To appear, ACM SIGMOD (2020).
  66. Running a Twitter Clone on Cloudburst

    [Figure: read and write latency (ms) for Cloudburst (LWW), Cloudburst (Causal), and Redis; plotted values: 16.1, 18.0, 15.0, 397, 501, 810, 31.9, 79, 27.9, 503, 801, 921.3.]
  67. Prediction Serving on Cloudburst

    [Figure: latency (ms) for Python, Cloudburst, AWS SageMaker, and AWS Lambda; plotted values: 182.5, 210.2, 355.8, 1181, 191.5, 277.4, 416.6, 1364.]