Toward a Programmable Cloud: Foundations and Challenges

Keynote talk at ACM POPL 2021.

Joe Hellerstein

April 07, 2022

Transcript

  1. Toward a Programmable Cloud: Foundations and Challenges

    JOE HELLERSTEIN, UC BERKELEY. POPL Keynote, 2021
  2. Sea Changes in Computing

    PDP-11, 1970: Minicomputers. Cray-1, 1976: Supercomputers. Macintosh, 1984: Personal Computers. AWS, 2006: Public Cloud. iPhone, 2007: Smart Phones.
  3. New Platform + New Language = Innovation

    PDP-11, 1970: Minicomputers. Cray-1, 1976: Supercomputers. Macintosh, 1984: Personal Computers. AWS, 2006: Public Cloud ? iPhone, 2007: Smart Phones.
  4. The Big Query ? AWS, 2006 Public Cloud How will

    people program the cloud? A grand challenge across core computer science.
  5. A Database/Logic Approach

    • Disorderly! Yet well-defined semantics. Parallelism and scale since the 1980s!
    • Declarative: The cloud is crying out for this! Heterogeneous, evolving, autoscaling.
    • Beautiful: Growing body of work on distributed Logic Programming. Asking and answering fundamental new questions! (PODS, POPL, PODC)
  6. 6 Prior Generation: Language Design, Top-down SIGCOMM ‘05 SIGMOD ‘06

    SOSP ‘05 SenSys ‘07 EuroSys ‘10 PODS ‘10 Datalog 2.0 ‘10 SIGOPS 2010 CIDR ‘11 SOCC ‘12
  7. 7 Latest Gen: Serverless Systems, Bottom-up Cloudburst: Stateful FaaS VLDB20,

    SIGMOD20, Eurosys20 Anna: Autoscaling multi-tier KVS ICDE18, VLDB19 f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) f(x) Proving out the foundations in a traditional systems setting.
  8. 8 Time to Bring it all Together CACM Sep. 2020

    CIDR 2021
  9. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches Today: Lessons and Foundations
  10. Routing As Querying

    link(X,Y).
    path(X,Y) :- link(X,Y).
    path(X,Z) :- link(X,Y), path(Y,Z).
    path(X, s)?
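The routing query above can be read as a transitive-closure computation. The following is a minimal sketch of naive bottom-up evaluation of those two rules; the function name `transitive_closure` and the sample `links` are illustrative assumptions, not taken from the talk.

```python
def transitive_closure(links):
    """Naive bottom-up evaluation of:
         path(X,Y) :- link(X,Y).
         path(X,Z) :- link(X,Y), path(Y,Z).
    """
    path = set(links)  # first rule: every link is a path
    while True:
        # second rule: join link(X,Y) with path(Y,Z)
        derived = {(x, z) for (x, y) in links for (y2, z) in path if y == y2}
        new = derived - path
        if not new:  # fixpoint reached: no new facts
            return path
        path |= new


# Hypothetical topology: a -> b -> c -> s
links = {("a", "b"), ("b", "c"), ("c", "s")}
paths = transitive_closure(links)

# path(X, s)? : every X that can reach s
sources = sorted(x for (x, dst) in paths if dst == "s")
print(sources)
```

Because the rules only ever add facts, the order in which links arrive does not change the final set of paths.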
  11. Protocol Synthesis as Query Optimization

    “Finally, writing the queries in NDlog illustrates surprising relationships between protocols. For example, we have shown that distance vector and dynamic source routing protocols differ only in a simple, traditional query optimization decision: the order in which a query’s predicates are evaluated.” (CACM 11/2009)
  12. DHT Overlays in Logic 12 SOSP 2005

  13. A Hadoop Backend in Logic

    BOOM Analytics (EuroSys ‘10). BOOM-FS: Rewrite of Hadoop’s HDFS in Overlog. Replacement of Hadoop Scheduler in Overlog.
    “We began by creating an Overlog implementation of basic Paxos … Our first effort was clean and fairly simple: 22 Overlog rules in 53 lines of code, corresponding nearly line-for-line with the invariants from Lamport’s original paper [21].”
  14. What Worked Well Concise, declarative implementation with good performance 10-20x

    more concise than Java (LOCs) Similar performance (within 10-20%) Separation of policy and mechanism Ease of evolution 1. High availability (failover + Paxos) 2. Scalability (hash partitioned FS master) 3. Monitoring as an aspect
  15. What Worked Poorly Dodgy semantics As with Prolog, depended on

    understanding interpreter behavior In particular, 1. change (e.g., state update) 2. uncertainty (e.g., async communication)
  16. None
  17. None
  18. 18 Classical Consistency Mechanisms: Coordination Consensus: Simple: Get a (fixed)

    set of machines to choose the same thing (Paxos) SMR: Get a (fixed) set of machines to choose the same sequence of things (Multipaxos) Commit Protocols: Get a (fixed) set of machines to agree on transaction commit (Two-Phase Commit)
  19. Coordination Avoidance

    “the first principle of successful scalability is to batter the consistency mechanisms down to a minimum… make it as hard as possible for application developers to get permission to use them” (James Hamilton (IBM, MS, Amazon), in Birman, Chockler: “Toward a Cloud Computing Research Agenda”, LADIS 2009)
  20. 20 Our own experience… We didn’t need coordination for Network

    Routing! We did need it for certain DHT overlays that were “racy” We implemented it for BOOM-FS (Multipaxos)! When MUST we coordinate, when can we find something more clever? What is coordination FOR?!
  21. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Semantics: Dedalus & Bloom • Computability: The CALM Theorem Today: Lessons and Foundations
  22. 22 Dedalus and Bloom Dedalus Datalog in time and space

    Bloom A Ruby-based functional DSL based on Dedalus Extended to BloomL to incorporate lattice types (more shortly) http://bloom-lang.net
  23. 23 Dedalus: It’s About Time Datalog constrained by: Logical time

    as the rightmost attribute of all predicates. Restrictions on rules to track state over time “Time is what keeps everything from happening at once.” Ray Cummings, The Girl in the Golden Atom, 1922 Datalog 2010
  24. Sugared Dedalus

    • Deductive rule (immediate): p(A,B) :- q(A,B).
    • Inductive rule (sequential): p(A,B)@next :- q(A,B).
    • Async rule (deferred, communication): p(A,B)@async :- q(A,B).
  25. Dedalus

    • Deductive rule (immediate): p(A,B,S) :- q(A,B,T), T=S.
    • Inductive rule (sequential): p(A,B,S) :- q(A,B,T), S=T+1.
    • Async rule (deferred, communication): p(A,B,S) :- q(A,B,T), time(S), choose((A,B,T), (S)). (PODS 1990)
  26. Logic and time Key relationships: Sequentiality Mutual exclusion Atomicity Datalog:

    Relationships among facts Dedalus: Also, relationships between states
  27. Dedalus: Semantics Approach: model theory Datalog famously has a unique

    minimal model Which coincides with the least fixed point semantics! In Dedalus, each model is a trace 1. Infinite across time (hence minimality is tricky) 2. Non-deterministic across time, via the choice construct (hence uniqueness is tricky)
  28. Dedalus: Semantics Dedalus: from stable models to Ultimate Models Ultimate

    model: those facts that are eventually always true Programs with a unique ultimate model are confluent. Berkeley/Hasselt Tech Report, 2011 Datalog in Academia and Industry, LNCS 2012 TPLP 2015
  29. None
  30. 30 Conjecture (CALM): A distributed program P has a consistent,

    coordination-free distributed implementation if and only if it is monotonic. Hellerstein, PODS 2010 CALM: CONSISTENCY AS LOGICAL MONOTONICITY
  31. 31 Theorem (CALM): A distributed program P has a consistent,

    coordination-free distributed implementation if and only if it is monotonic. Ameloot, Neven, Van den Bussche, PODS 2011 CALM: CONSISTENCY AS LOGICAL MONOTONICITY
  32. 32 We’ll need some formal definitions

  33. 33 Consistency: Confluent Distributed Execution Definition: A distributed program P

    is consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication.
  34. 34 Consistency: Confluent Distributed Execution Definition: A distributed program P

    is consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication.
  35. 35 Monotonicity Definition: A distributed program P is consistent if

    it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T).
  36. 36 Monotonicity Definition: A distributed program P is consistent if

    it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T).
  37. 37 Coordination: Data-Independent Messaging Definition: A distributed program P is

    consistent if it is a deterministic function from sets to sets, regardless of non-deterministic message ordering and duplication. Definition: A distributed program P is monotonic if for any input sets S, T if S ⊆ T, then P(S) ⊆ P(T). Definition: A distributed program P(T) uses coordination if it requires messages to be sent under all possible input partitionings (shardings) of T.
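The monotonicity definition above can be checked by brute force on a small universe. This is only an illustrative sketch: the helper names (`subsets`, `is_monotonic`) and the two sample programs are assumptions made for the example, and an exhaustive check on a finite universe is not a proof for arbitrary inputs.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotonic(program, universe):
    """Check: for all S ⊆ T, program(S) ⊆ program(T)."""
    all_sets = subsets(universe)
    return all(program(s) <= program(t)
               for s in all_sets for t in all_sets if s <= t)


def evens(s):
    # A selection: adding input can only add output.
    return frozenset(x for x in s if x % 2 == 0)


def only_if_no_three(s):
    # Depends on the ABSENCE of 3: new input can retract output.
    return frozenset() if 3 in s else frozenset(s)


universe = {1, 2, 3, 4}
evens_ok = is_monotonic(evens, universe)
negation_ok = is_monotonic(only_if_no_three, universe)
print(evens_ok, negation_ok)
```

The second program fails because P({1}) = {1} while P({1, 3}) is empty, so a larger input yields a smaller output.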
  38. Two Canonical Examples

    • Distributed Deadlock: Once you observe the existence of a waits-for cycle, you can (autonomously) declare deadlock. More information will not change the result. Deadlock!
    • Garbage Collection: Suspecting garbage (the non-existence of a path from root) is not enough; more information may change the result. Hence you are required to check all nodes for information (under any assignment of objects to nodes!) Garbage?
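The contrast between the two examples can be sketched in a few lines. The graphs, node names, and helper functions below are hypothetical; the point is only that a cycle found in partial information stays a cycle, while an object that looks unreachable in partial information may turn out to be reachable.

```python
def has_cycle(edges):
    """True if the waits-for graph contains a cycle (depth-first search)."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def dfs(n):
        if n in visiting:   # back edge: cycle found
            return True
        if n in done:
            return False
        visiting.add(n)
        found = any(dfs(m) for m in graph.get(n, ()))
        visiting.discard(n)
        done.add(n)
        return found

    return any(dfs(n) for n in list(graph))


def unreachable(objects, refs, root):
    """Objects with no known path from root (suspected garbage)."""
    seen, frontier = {root}, [root]
    while frontier:
        n = frontier.pop()
        for a, b in refs:
            if a == n and b not in seen:
                seen.add(b)
                frontier.append(b)
    return set(objects) - seen


# Deadlock: a locally observed cycle survives any additional edges.
partial_waits = {("t1", "t2"), ("t2", "t1")}
full_waits = partial_waits | {("t3", "t1")}
deadlock_partial = has_cycle(partial_waits)
deadlock_full = has_cycle(full_waits)

# Garbage: "b" looks unreachable until another node reports a reference to it.
objects = {"root", "a", "b"}
partial_refs = {("root", "a")}
full_refs = partial_refs | {("a", "b")}
garbage_partial = unreachable(objects, partial_refs, "root")
garbage_full = unreachable(objects, full_refs, "root")
print(deadlock_partial, deadlock_full, garbage_partial, garbage_full)
```

The deadlock answer never flips from true to false as edges arrive, whereas the garbage answer shrinks, which is why the second question cannot be answered safely from partial information.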
  39. CALM (Consistency As Logical Monotonicity)

    Theorem (CALM): A distributed program P has a consistent, coordination-free distributed implementation if and only if it is monotonic.
    • Hellerstein JM. The declarative imperative: Experiences and conjectures in distributed logic. ACM SIGMOD Record, Sep 2010.
    • Ameloot TJ, Neven F, Van den Bussche J. Relational transducers for declarative networking. JACM, Apr 2013.
    • Ameloot TJ, Ketsman B, Neven F, Zinn D. Weaker forms of monotonicity for declarative networking: a more fine-grained answer to the CALM-conjecture. ACM TODS, Feb 2016.
    • Hellerstein JM, Alvaro P. Keeping CALM: When Distributed Consistency is Easy. CACM, September 2020.
  40. 40 Weaker forms of monotonicity Policy-aware transducers Suppose the transducer

    knows the policy for sharding Some “not exists” queries become coordination-free! A hierarchy of definitions of “weaker monotonicity” under “stronger policy knowledge” Deeper connections to Datalog variants Descriptive Complexity characterizations Is there more? PODS ‘14
  41. 41 There’s More to this Story… CACM Sep. 2020 SIGMOD

    Record Jun 2014
  42. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Join semilattices and CRDTs • Lattice composition Today: Lessons and Foundations • Semantics: Dedalus & Bloom • Computability: The CALM Theorem
  43. 43 My Systems Friends Forget all this logic stuff. All

    I want to talk about is stuff like reordering API calls.
  44. http://www.flickr.com/photos/25579597@N00/40955407/ CIDR 2009 Associative, Commutative Idempotent, Distributed

  45. Associative, Commutative, Idempotent, Distributed

    • Associative: (X ◦ Y) ◦ Z = X ◦ (Y ◦ Z) (batch-insensitive)
    • Commutative: X ◦ Y = Y ◦ X (order-insensitive)
    • Idempotent: X ◦ X = X (resend-insensitive)
    • Distributed: acronym-insensitive
  46. Storing an Integer

    VON NEUMANN:
      int ctr;
      operator:= (x) {
        // assign
        ctr = x;
      }
    ACID 2.0:
      int ctr;
      operator<= (x) {
        // merge
        ctr = MAX(ctr, x);
      }
    DISORDERLY INPUT STREAMS:
      2, 5, 6, 7, 11, 22, 44, 91
      5, 7, 2, 11, 44, 6, 22, 91, 5
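As a rough illustration of the two operators on this slide, the sketch below replays the two input streams under plain assignment and under a max merge. The function names and the initial value of 0 are assumptions made for the example.

```python
stream_a = [2, 5, 6, 7, 11, 22, 44, 91]
stream_b = [5, 7, 2, 11, 44, 6, 22, 91, 5]  # reordered, with a duplicate 5


def assign_all(stream):
    """Von Neumann style: each input overwrites the counter."""
    ctr = 0  # assumed initial value
    for x in stream:
        ctr = x
    return ctr


def merge_all(stream):
    """ACID 2.0 style: each input is merged with max."""
    ctr = 0  # assumed bottom element for non-negative inputs
    for x in stream:
        ctr = max(ctr, x)
    return ctr


assigned = (assign_all(stream_a), assign_all(stream_b))
merged = (merge_all(stream_a), merge_all(stream_b))
print(assigned, merged)
```

Assignment ends on whatever arrived last, so the two streams disagree; the max merge reaches the same value for both orderings.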
  47. Storing an Integer

    VON NEUMANN vs. ACID 2.0. DISORDERED INPUT STREAMS: 2, 5, 6, 7, 11, 22, 44, 91 and 5, 7, 2, 11, 44, 6, 22, 91, 5.
    [Two plots, one per approach; axis ticks 0 to 100 and 1 to 9.]
  48. 48 Sounds familiar! Observation: ACID 2.0 is monotonic! Direct result

    of semi-lattice properties Observation: ACID 2.0 is confluent! a.k.a “eventually consistent” without coordination Monotonicity => Consistency
  49. CRDTs

    ACI properties: join semilattices. Convergent Replicated Data Type (CRDT): a class, with a merge function. Merge is Associative, Commutative and Idempotent. Implementations of many handy data structures as CRDTs.
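One commonly cited example of such a class is a grow-only counter. The sketch below is an illustrative implementation written for this transcript, not code from the talk; the class name `GCounter` and the replica names are assumptions.

```python
class GCounter:
    """Grow-only counter: one non-decreasing count per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, replica, n=1):
        self.counts[replica] = self.counts.get(replica, 0) + n

    def merge(self, other):
        # Pointwise max: associative, commutative and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})

    def value(self):
        return sum(self.counts.values())

    def __eq__(self, other):
        return self.counts == other.counts


a, b, c = GCounter(), GCounter(), GCounter()
a.increment("r1", 2)
b.increment("r2", 3)
c.increment("r1", 1)
c.increment("r3", 4)

commutative = a.merge(b) == b.merge(a)
idempotent = a.merge(a) == a
associative = a.merge(b).merge(c) == a.merge(b.merge(c))
total = a.merge(b).merge(c).value()
print(commutative, idempotent, associative, total)
```

Replicas can exchange and merge state in any order, any number of times, and still converge on the same counts.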
  50. 50 Problems: Scoping and Correctness Scope dilemma: Is my application

    a CRDT? E.g. Google Docs. How can I be sure?
  51. 51 Desire: Composition of Simple Lattices Write simple lattices that

    are easy to test How to compose lattices into larger programs? SOCC 2012 http://bloom-lang.net ICFP 2016, 2018
  52. Monotone Functions

    • Set (Merge = Union)
    • Increasing Int (Merge = Max)
    • Boolean (Merge = Or)
    Can use monotone functions to map to other lattices! Examples: card(S) maps a set to an increasing int; x > 6 maps an increasing int to a boolean.
    [Diagrams: subsets of {a,b,c} under union; ints 3, 5, 7 under max; false/true under or.]
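The chain of lattices on this slide can be traced with ordinary Python values. This is a small sketch with made-up updates: a set grows by union, its cardinality only increases, and the threshold test only ever flips from false to true.

```python
# Hypothetical batches of updates arriving at a replica.
updates = [{1, 2}, {2, 3, 4}, {5}, {6, 7, 8}]

state = set()          # Set lattice, merge = union
sizes, flags = [], []
for delta in updates:
    state |= delta                    # merge the new batch
    sizes.append(len(state))          # card(S): set -> increasing int
    flags.append(len(state) > 6)      # x > 6: increasing int -> boolean

print(sizes, flags)
```

Neither derived value ever moves backwards, so downstream consumers never have to retract something they have already seen.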
  53. Morphisms

    Monotone functions that distribute across Merge: m(X ◦ Y) = m(X) ◦ m(Y)
    Why do we care? Differentiation/Flow. Delta-based database query processing tricks translate naturally (e.g. semi-naïve evaluation).
    [Diagram: max(S) maps the lattice of non-empty subsets of {1,2,3,4} to the ints 1 to 4: a morphism!]
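The morphism condition can be tested exhaustively on the small lattice from the slide. In this sketch the helper names are assumptions, the source merge is set union, and the target merge is integer max; cardinality is included as a monotone function that is not a morphism under that target merge.

```python
from itertools import combinations


def nonempty_subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def is_morphism(f, sets):
    """Check f(X ∪ Y) == max(f(X), f(Y)) for every pair of sets."""
    return all(f(x | y) == max(f(x), f(y)) for x in sets for y in sets)


sets = nonempty_subsets({1, 2, 3, 4})
max_is_morphism = is_morphism(max, sets)    # max(S) distributes over union
card_is_morphism = is_morphism(len, sets)   # card(S) is monotone but does not

# Delta-style processing: fold in only the new elements.
running = max({1, 2})
running = max(running, max({4}))
print(max_is_morphism, card_is_morphism, running)
```

Because max distributes over union, the result for a grown set can be computed from the old result and the delta alone, without revisiting the old elements.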
  54. 54 Anna: Mutable State Encapsulated in Lattices A coordination-free KVS

    in C++ Simple lattices a la BloomL for various consistency levels Choose any HAT , we tag your data with the right lattice Anna runs without any coordination, locks or atomics Every core uses private memory Versions are gossiped/merged across cores Extremely simple, crazy fast and scalable VLDB 2013 ICDE 2019
  55. Open Problems and PACT Programming Logic Foundations Declarative Programming •

    Lessons learned from early languages and prototypes • The need for theory Algebraic Approaches • Join semilattices and CRDTs • Lattice composition • CALM Advances • Towards a Programmable Cloud Today: Lessons and Foundations • Semantics: Dedalus & Bloom • Computability: The CALM Theorem
  56. 56 Open Problems 1: Constructive/Relaxed CALM CALM says what’s possible.

    It does not help you achieve it. Constructive CALM Input: a monotonic input/output “problem” Desired output: a coordination-free implementation. Or some help designing it! Relaxed CALM Input: a non-monotonic problem specification Desired output: a “relaxed” specification that is monotonic and “similar”.
  57. 57 Open Problems 2: Stochastic CALM NeurIPS 2011 NeurIPS 2015

    A supermartingale [7] is a stochastic process Wt such that E[Wt+1|Wt] ≤ Wt . That is, the expected value is non-increasing over time.
  58. 58 Open Problems 3: PTIME in the cloud? Semi-positive Datalog:

    Datalog enhanced with: Negation on stored facts (only!) A successor relation (some source of order) Immerman-Vardi Theorem: Semipositive Datalog = PTIME A practical, weaker form of monotonicity corresponding to semi-positive Datalog?! What is our source of order? Is this simply coordination, or is there something weaker?
  59. New Directions in Cloud Programming ALVIN CHEUNG, NATACHA CROOKS, JOE

    HELLERSTEIN & MATTHEW MILANO UC BERKELEY
  60. A New PACT for Cloud Programming

    A declaration of four independent facets:
    • P, Program Semantics: Functionality modulo distribution
    • A, Availability: Tolerate f independent failures
    • C, Consistency: Application-specific guarantees per API
    • T, Targets for Optimization: Multi-Objective Optimization
  61. ¡Viva La Evolución!

  62. HYDRO: A PACT Programming Stack … Cloud Services … FaaS

    Storage ML Frameworks Actors (e.g. Orleans) Functional (e.g. Spark) Logic (e.g. Bloom) Futures (e.g. Ray) P Program Semantics A Availability New DSLs HYDROLOGIC HYDRAULIC Verified Lifting HYDROLYSIS Compiler HYDROFLOW Deployment HYDROFLOW Program C Consistency Targets for Optimization T Sequential Code HYDRO Polyglot programming Declarative PACT IR Data/Lattice/Eventflow IR Adaptive self-deploying exe’s
  63. Many wonderful friends and colleagues

  64. Many wonderful friends and colleagues Key PhD Dissertations

  65. More Information

    Joe Hellerstein, hellerstein@berkeley.edu, @joe_hellerstein
    • CALM CACM: https://bit.ly/calmcacm
    • Hydro Paper: http://bit.ly/hydroCIDR21
    • Hydro: https://hydro-project.github.io/
    • Blog Post: https://bit.ly/stateofserverlessart
    • DSF@Berkeley: https://dsf.berkeley.edu
  66. 66 Backup Slides

  67. State Update Via Frame Rules

    p(A, B)@next :- p(A, B), ¬p_del(A, B).
    Example Trace: p(1, 2)@101; p(1, 3)@102; p_del(1, 3)@300;
    [Timeline: rows p(1, 2), p(1, 3), p_del(1, 3); times 101, 102, ..., 300, 301.]
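The frame rule and example trace can be replayed with a small timestep loop. This is a simplified sketch, not a Dedalus interpreter: the `run` function and the event encoding are assumptions, and `p_del` is treated as holding only at the timestep where it arrives because the slide gives no rule that persists it.

```python
def run(events, start, end):
    """Replay facts over logical time.

    events maps a timestep to a list of (relation, fact) pairs arriving then.
    Returns a dict: timestep -> set of p facts that hold at that timestep.
    """
    p = set()
    trace = {}
    for t in range(start, end + 1):
        arrivals = events.get(t, [])
        p |= {fact for rel, fact in arrivals if rel == "p"}
        p_del = {fact for rel, fact in arrivals if rel == "p_del"}
        trace[t] = set(p)
        # p(A, B)@next :- p(A, B), ¬p_del(A, B).
        p = p - p_del
    return trace


events = {
    101: [("p", (1, 2))],
    102: [("p", (1, 3))],
    300: [("p_del", (1, 3))],
}
trace = run(events, 101, 301)
print(trace[101], trace[102], trace[300], trace[301])
```

In this reading, p(1, 3) still holds at time 300, when the deletion arrives, and is absent from time 301 onward, while p(1, 2) persists throughout.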
  68. 68 The CALM Theorem Went Further! An oblivious transducer model

    does not access the following information: All (“membership”: a table of all nodes in the system) Id (“self-awareness”: the name of the local node) For queries Q in language L, the following are all equivalent! Q can be computed by a coordination-free L-transducer Q is monotonic Q can be computed by an oblivious L-transducer Practical implications of (non-)obliviousness for highly dynamic systems! E.g. serverless computing, edge/fog computing, p2p
  69. Related Work

    • Koutris & Suciu. “Parallel Evaluation of Conjunctive Queries.” PODS 2011
    • Beame, Koutris & Suciu. “Communication Steps for Parallel Query Processing.” PODS 2013
    • Beame, Koutris & Suciu. “Skew in Parallel Query Processing.” PODS 2014
    • Koutris, Beame & Suciu. “Worst-Case Optimal Algorithms for Parallel Query Processing.” ICDT 2016
  70. 70 Towards a Solution Systems work is all about I/O

    & memory models. What if we reason about application semantics? With thanks to Peter Bailis…
  71. 71 Logic and Lattice Composition Bloom: a “disorderly” programming language

    based on logic and lattices Monotone composition of “lego blocks” lattices into bigger programs. Allows non-monotonic expressions but they require “extra” syntax. Syntactic CALM analysis Alvaro P, Conway N, Hellerstein JM, Marczak WR. Consistency Analysis in Bloom: a CALM and Collected Approach. CIDR 2011. Conway N, Marczak WR, Alvaro P, Hellerstein JM, Maier D. Logic and lattices for distributed programming. ACM SoCC, 2012.
  72. CloudBurst/Hydrocache: A Stateful Serverless Platform

    • Competitive performance for a prediction serving pipeline. [Chart: latency (ms) for Python, Droplet, AWS Lambda (Mock), AWS SageMaker, AWS Lambda (Actual).]
    • Performant consistency on a real-world web app. [Chart: read and write latency (ms, log scale) for Droplet (LWW), Droplet (Causal), Redis.]
    • End-to-end latency for a task with large data inputs. [Chart: latency (ms, log scale) for Droplet (Hot), Droplet (Cold), Lambda (Redis), Lambda (S3) at sizes 80KB, 800KB, 8MB, 80MB.]
  73. Intuition: Streaming Queries are Monotonic

    • 𝜎 p: SELECT * FROM input WHERE …
    • 𝜋 f…: SELECT f(…), … FROM input
    • ⋈ p: SELECT * FROM i1 INNER JOIN i2 ON …
  74. Intuition: Blocking Queries are Non-Monotonic

    • 𝛤 k,g: SELECT key, COUNT(*) FROM input GROUP BY key
    • 𝛤 k,b(g): SELECT key FROM input HAVING COUNT(*) > 10
  75. Intuition: Blocking Queries are Non-Monotonic

    • 𝛤 k,g: SELECT key, COUNT(*) FROM input GROUP BY key
    • 𝛤 k,b(g): SELECT key FROM input HAVING COUNT(*) > 10
    Except in special cases! We’ll come back to this…
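The intuition on these slides can be tried out on a tiny input that grows over time. In this sketch the rows, function names, and the small threshold are all made up for illustration; rows are modeled as a set of (key, value) pairs.

```python
from collections import Counter


def select_even(rows):
    """Selection: SELECT * FROM input WHERE value is even."""
    return {r for r in rows if r[1] % 2 == 0}


def count_by_key(rows):
    """Aggregation: SELECT key, COUNT(*) FROM input GROUP BY key."""
    return set(Counter(k for k, _ in rows).items())


def keys_over(rows, threshold):
    """Threshold on a count: keys with COUNT(*) > threshold."""
    counts = Counter(k for k, _ in rows)
    return {k for k, n in counts.items() if n > threshold}


early = {("a", 1), ("a", 2), ("b", 4)}
late = early | {("a", 3), ("b", 6)}   # more input arrives

select_grows = select_even(early) <= select_even(late)
count_grows = count_by_key(early) <= count_by_key(late)
threshold_grows = keys_over(early, 1) <= keys_over(late, 1)
print(select_grows, count_grows, threshold_grows)
```

The selection only adds rows as input grows. The exact counts emitted early, such as ("a", 2), are wrong later and would need to be retracted. The threshold query is one of the special cases: once a key's count exceeds the bound it stays above it.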
  76. Anna Serverless KVS

    • Anyscale: perform like Redis, scale like S3
    • CALM consistency levels via simple lattices
    • Autoscaling & multitier serverless storage
    • Won best-of-conference at ICDE, VLDB [1, 2]
    [1] Wu, Chenggang, et al. "Anna: A kvs for any scale." IEEE Transactions on Knowledge and Data Engineering (2019).
    [2] Wu, Chenggang, Vikram Sreekanti, and Joseph M. Hellerstein. "Autoscaling tiered cloud storage in Anna." PVLDB 12.6 (2019): 624-638.
  77. 77 Anna Performance Shared-nothing at all scales (even across threads)

    Crazy fast under contention Up to 700x faster than Masstree within a multicore machine Up to 10x faster than Cassandra in a geo-distributed deployment Coordination-free consistency. No atomics, no locks, no waiting ever! 700x!
  78. 78 CALM Consistency Simple, clean lattice composition gives range of

    consistency levels Lines of C++ code modified by system component KEEP CALM AND WRITE(X)
  79. 79 Autoscaling & Multi-Tier Cost Tradeoffs 350x the performance of

    DynamoDB for the same price!
  80. Cloudburst: A Stateful Serverless Platform Main Challenge: Cache consistency! Hydrocache:

    new consistency protocols for distributed client “sessions” Compute Storage
  81. Multiple Consistency Levels Here Too

    Read Atomic transactions
    • AFT [1]: a fault tolerance shim layer between any FaaS and any object store
    • Currently evaluated between AWS Lambda and AWS S3!
    Multisite Transactional Causal Consistency (MTCC) [2]
    • Causal: Preserve Lamport’s happened before relation
    • Multisite transactional: Nested functions running across multiple machines.
    [1] Sreekanti, Vikram, et al. A Fault-Tolerance Shim for Serverless Computing. To appear, Eurosys (2020).
    [2] Wu, Chenggang, et al. Transactional Causal Consistency for Serverless Computing. To appear, ACM SIGMOD (2020).
  82. Running a Twitter Clone on Cloudburst

    [Chart: read and write latency (ms, log scale) for Cloudburst (LWW), Cloudburst (Causal), Redis.]
  83. Prediction Serving on Cloudburst

    [Chart: latency (ms) for Python, Cloudburst, AWS SageMaker, AWS Lambda.]