Slide 1

Slide 1 text

A Certain Tendency of the Database Community Christopher S. Meiklejohn Université catholique de Louvain, Belgium Instituto Superior Técnico, Portugal 1 LIGHT ONE

Slide 2

Slide 2 text

Certain Tendency • Certain tendency
 Replicated databases are treated as a “single” system; the databases are the “source of truth” 2

Slide 3

Slide 3 text

Certain Tendency • Certain tendency
 Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients
 Data is “owned” by the clients that create the data and data exists as soon as it is created 2

Slide 4

Slide 4 text

Certain Tendency • Certain tendency
 Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients
 Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck
  Databases serve as a “convenience” that makes it easier to write applications (think: shared memory registers); however, they reduce availability 2

Slide 5

Slide 5 text

Certain Tendency • Certain tendency
 Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients
 Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck
  Databases serve as a “convenience” that makes it easier to write applications (think: shared memory registers); however, they reduce availability • The edge is the source of truth!
 We need models and abstractions that allow us to write correct applications that operate with distributed data where it is being generated: the edge 2

Slide 6

Slide 6 text

Background Consistency 3

Slide 7

Slide 7 text

Consistency Models • Contract
 Between the application developer and the system that application will be deployed on 4

Slide 8

Slide 8 text

Consistency Models • Contract
 Between the application developer and the system that application will be deployed on • Guaranteed outcomes following certain rules
 Event interleaving, possible partial-orders, update visibility, when and where, etc. 4

Slide 9

Slide 9 text

Consistency Models • Contract
 Between the application developer and the system that application will be deployed on • Guaranteed outcomes following certain rules
 Event interleaving, possible partial-orders, update visibility, when and where, etc. • Required for building applications
 Otherwise, we may pick a system to deploy our application on where our application returns incorrect results 4

Slide 10

Slide 10 text

Strong vs. Weak • Strong
 Linearizability is the strongest; respects the “real-time” order of events 5

Slide 11

Slide 11 text

Strong vs. Weak • Strong
 Linearizability is the strongest; respects the “real-time” order of events • Weak
 Eventual consistency; informally specified, no bound on when an update may be visible 5

Slide 12

Slide 12 text

Eventual Consistency “...the storage system guarantees that if no new updates are made to the [replicated, shared] object, eventually all accesses [to any replica] will return the last updated value.” - W. Vogels 6

Slide 13

Slide 13 text

Eventual Consistency “...the storage system guarantees that if no new updates are made to the [replicated, shared] object, eventually all accesses [to any replica] will return the last updated value.” - W. Vogels Rather weak model, but used by many large-scale distributed systems today… 6

Slide 14

Slide 14 text

Why pick a weaker model when stronger models exist? 7

Slide 15

Slide 15 text

CAP Theorem • Consistency is at odds with availability
 If systems wish to remain functioning under network partitions, systems must sacrifice one or the other 8

Slide 16

Slide 16 text

CAP Theorem • Consistency is at odds with availability
 If systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency
 Guarantees on event order and event visibility 8

Slide 17

Slide 17 text

CAP Theorem • Consistency is at odds with availability
 If systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency
 Guarantees on event order and event visibility • Availability
 Ability for a system to keep servicing requests under network partitions and/or failures 8

Slide 18

Slide 18 text

CAP Example • Two replicas of a reservation system…
 Replicated for fault-tolerance to ensure system availability 9

Slide 19

Slide 19 text

CAP Example • Two replicas of a reservation system…
 Replicated for fault-tolerance to ensure system availability • Two concurrent requests…
 Tom and Chris attempt to reserve the last available seat on a plane 9

Slide 20

Slide 20 text

CAP Example • Two replicas of a reservation system…
  Replicated for fault-tolerance to ensure system availability • Two concurrent requests…
  Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths…
  If one replica of the system cannot reach the other replica, we have two choices: 9

Slide 21

Slide 21 text

CAP Example • Two replicas of a reservation system…
  Replicated for fault-tolerance to ensure system availability • Two concurrent requests…
  Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths…
  If one replica of the system cannot reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking
 Return an error to the user and prevent both bookings 9

Slide 22

Slide 22 text

CAP Example • Two replicas of a reservation system…
  Replicated for fault-tolerance to ensure system availability • Two concurrent requests…
  Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths…
  If one replica of the system cannot reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking
 Return an error to the user and prevent both bookings • [Favoring Availability] Allow concurrent requests
 However, now the seat is double booked and we must have a “conflict resolution” function for returning the system to a consistent state 9
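
The two paths above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real reservation system: the `Replica` class, the `reserve_cp` and `reserve_ap` methods, and the alphabetical `resolve` rule are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One replica of a hypothetical seat-reservation system."""
    name: str
    seats_left: int = 1                      # the last available seat
    bookings: list = field(default_factory=list)
    peer_reachable: bool = True              # False models a network partition

    def reserve_cp(self, passenger: str) -> bool:
        """Favor consistency: refuse to book when the peer cannot be reached."""
        if not self.peer_reachable:
            return False                     # an error is returned to the user
        return self._book(passenger)

    def reserve_ap(self, passenger: str) -> bool:
        """Favor availability: book locally even while partitioned."""
        return self._book(passenger)

    def _book(self, passenger: str) -> bool:
        if self.seats_left == 0:
            return False
        self.seats_left -= 1
        self.bookings.append(passenger)
        return True


def resolve(a: Replica, b: Replica, capacity: int = 1):
    """Illustrative conflict resolution (an assumption of this sketch):
    keep bookings in alphabetical order up to capacity and bump the rest."""
    merged = sorted(set(a.bookings) | set(b.bookings))
    return merged[:capacity], merged[capacity:]
```

With both replicas partitioned, `reserve_cp` rejects Tom and Chris alike, while `reserve_ap` accepts both and leaves `resolve` to decide who keeps the seat once the replicas can talk again. Any deterministic rule would do here; the point is only that every replica must apply the same one.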

Slide 23

Slide 23 text

Real World Analogies Eventual Consistency 10

Slide 24

Slide 24 text

Real World Analogies Eventual Consistency Does the physical world favor availability over consistency? 10

Slide 25

Slide 25 text

Recorded Knowledge • Approximation
 Approximation of globally known knowledge that is periodically recorded 11

Slide 26

Slide 26 text

Recorded Knowledge • Approximation
 Approximation of globally known knowledge that is periodically recorded • “Potentially outdated”
 Act of recording this information produces an artifact that is already outdated unless the system has quiesced 11

Slide 27

Slide 27 text

Message Passing • Exchange messages
 Members of the same system exchange messages asynchronously 12

Slide 28

Slide 28 text

Message Passing • Exchange messages
 Members of the same system exchange messages asynchronously • Dropped or delayed
 Messages can either be dropped or delayed 12

Slide 29

Slide 29 text

Message Passing • Exchange messages
 Members of the same system exchange messages asynchronously • Dropped or delayed
 Messages can either be dropped or delayed • Examples
 Letters via the postal service;
 Text messages;
 Telephone calls 12

Slide 30

Slide 30 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information 13

Slide 31

Slide 31 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information • Coordinates updates
 Members coordinate updates to information they are the primary site for 13

Slide 32

Slide 32 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information • Coordinates updates
 Members coordinate updates to information they are the primary site for • Information can be cached
 Information from other sites can be cached by other members in the system 13

Slide 33

Slide 33 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information • Coordinates updates
 Members coordinate updates to information they are the primary site for • Information can be cached
 Information from other sites can be cached by other members in the system • Local or incomplete replica
 Use memory 13

Slide 34

Slide 34 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information • Coordinates updates
 Members coordinate updates to information they are the primary site for • Information can be cached
 Information from other sites can be cached by other members in the system • Local or incomplete replica
 Use memory • Stale replica
 Outdated printed map 13

Slide 35

Slide 35 text

Primary Site • Ownership of information
 Each member in the system owns the primary copy of their information • Coordinates updates
 Members coordinate updates to information they are the primary site for • Information can be cached
 Information from other sites can be cached by other members in the system • Local or incomplete replica
 Use memory • Stale replica
 Outdated printed map • Primary site
 Google Maps or the USGS, etc. 13

Slide 36

Slide 36 text

What really are Databases? 14

Slide 37

Slide 37 text

Database: an Optimization • Graph of primary copy locations
 Represents all members in the system with the data they create and are responsible for 15

Slide 38

Slide 38 text

Database: an Optimization • Graph of primary copy locations
 Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph
 Reduce several vertices in the graph to a single vertex: database for those entities 15

Slide 39

Slide 39 text

Database: an Optimization • Graph of primary copy locations
 Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph
 Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database
 Contracted edges per country, placing a replica in each country that served as the primary copy 15

Slide 40

Slide 40 text

Database: an Optimization • Graph of primary copy locations
 Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph
 Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database
 Contracted edges per country, placing a replica in each country that served as the primary copy • Wikipedia (for a given topic)
 Information about a given topic is stored here, written and coordinated by multiple authors 15

Slide 41

Slide 41 text

Why Optimize? • Expensive
 Retrieval from the primary site is expensive, if the primary site is geographically distant [latency] or unavailable [availability] 16

Slide 42

Slide 42 text

Why Optimize? • Expensive
 Retrieval from the primary site is expensive, if the primary site is geographically distant [latency] or unavailable [availability] • Replication introduces challenges
  Replication can make maintaining consistency much more challenging if caching/replication is pervasive 16

Slide 43

Slide 43 text

IoT and Mobile Applications • Centralized won’t scale
 Storing all data at a central location for processing won’t scale due to power and DC requirements 17

Slide 44

Slide 44 text

IoT and Mobile Applications • Centralized won’t scale
 Storing all data at a central location for processing won’t scale due to power and DC requirements • Today’s systems assume centralization
  Both programming models and applications used today assume centralization of data (e.g., Spark) 17

Slide 45

Slide 45 text

We are moving towards large-scale edge computation! 18

Slide 46

Slide 46 text

Edge Computation The Living Database 19

Slide 47

Slide 47 text

Database as a Constraint Satisfaction Problem • Where do we route requests?
 When we need to retrieve a certain piece of data, how do we know where to route the request to? 20

Slide 48

Slide 48 text

Database as a Constraint Satisfaction Problem • Where do we route requests?
  When we need to retrieve a certain piece of data, how do we know where to route the request to? • How do we specify acceptable staleness?
  Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable staleness bound? 20

Slide 49

Slide 49 text

Database as a Constraint Satisfaction Problem • Where do we route requests?
  When we need to retrieve a certain piece of data, how do we know where to route the request to? • How do we specify acceptable staleness?
  Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable staleness bound? • How do we bound latency?
  How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to meet a latency bound? 20

Slide 50

Slide 50 text

Database as a Constraint Satisfaction Problem • Where do we route requests?
  When we need to retrieve a certain piece of data, how do we know where to route the request to? • How do we specify acceptable staleness?
  Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable staleness bound? • How do we bound latency?
  How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to meet a latency bound? • How do we reason about staleness?
 Across multiple requests for the same object, how do we know which version is newer or older? 20
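
One way to read these questions as a constraint satisfaction problem is sketched below. Everything here is an assumption made for illustration: the `Source` record, its `latency_ms` and `staleness_s` fields, and the tie-breaking rule in `route` are hypothetical, and a real system would have to estimate these values rather than know them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Source:
    """A place a request could be routed to (hypothetical model)."""
    name: str
    is_primary: bool
    latency_ms: float    # expected round-trip time to this source
    staleness_s: float   # age of its copy; 0 for the primary site


def route(sources: List[Source],
          max_staleness_s: float,
          max_latency_ms: float) -> Optional[Source]:
    """Return the fastest source that satisfies both bounds, or None
    when the two constraints cannot be met together."""
    candidates = [
        s for s in sources
        if s.staleness_s <= max_staleness_s and s.latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None
    # Prefer lower latency; break ties with the fresher copy.
    return min(candidates, key=lambda s: (s.latency_ms, s.staleness_s))


sources = [
    Source("primary", True, latency_ms=120.0, staleness_s=0.0),
    Source("edge-cache", False, latency_ms=5.0, staleness_s=30.0),
    Source("regional-cache", False, latency_ms=40.0, staleness_s=2.0),
]
```

A `None` result is the interesting case: the application asked for data that is both fresher and faster than any reachable copy can provide, so one of the two bounds has to be relaxed.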

Slide 51

Slide 51 text

Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems
 Encapsulate AP replication concerns and exist in time and space 21

Slide 52

Slide 52 text

Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems
 Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result
 Conflict resolution and provenance information 21

Slide 53

Slide 53 text

Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems
 Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result
 Conflict resolution and provenance information • One example: CRDTs
 Conflict-free Replicated Data Types (Shapiro et al. 2011) 21

Slide 54

Slide 54 text

Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems
 Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result
 Conflict resolution and provenance information • One example: CRDTs
 Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality
 Capture causality for object mutations and can identify concurrent operations 21

Slide 55

Slide 55 text

Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems
 Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result
 Conflict resolution and provenance information • One example: CRDTs
 Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality
 Capture causality for object mutations and can identify concurrent operations • Concurrency
 Resolve concurrent operations using a bias (think: concurrent add(e) || remove(e) on the same set for same element) 21
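
As a rough sketch of the add-versus-remove case, the following is a small state-based set with an add-wins bias, in the style of an observed-remove set. It is an illustration under assumptions, not the definition from Shapiro et al.: the class name `ORSet`, the use of random UUIDs as tags, and the unbounded tombstone set are choices made for this example.

```python
import uuid


class ORSet:
    """Observed-remove set sketch: concurrent add(e) || remove(e) keeps e."""

    def __init__(self):
        self.adds = {}        # element -> set of unique tags, one per add
        self.removed = set()  # tombstones: tags whose add has been removed

    def add(self, e):
        # Every add gets a fresh tag, so later removes cannot cancel it
        # unless they have observed it.
        self.adds.setdefault(e, set()).add(uuid.uuid4().hex)

    def remove(self, e):
        # Remove only the tags this replica has observed so far.
        self.removed |= self.adds.get(e, set())

    def contains(self, e):
        return bool(self.adds.get(e, set()) - self.removed)

    def merge(self, other):
        """Join of two replica states: union of tags and of tombstones."""
        out = ORSet()
        for e in self.adds.keys() | other.adds.keys():
            out.adds[e] = self.adds.get(e, set()) | other.adds.get(e, set())
        out.removed = self.removed | other.removed
        return out
```

Because `merge` is a union on both components, it is commutative, associative, and idempotent, so replicas converge regardless of the order in which they exchange state. The bias shows up when one replica removes an element while another concurrently adds it again: the new add carries a tag the remove never observed, so the element survives the merge.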

Slide 56

Slide 56 text

Solution #2 Programming Model • Based on mergeable data structures
 Mergeable data structures form the core data abstraction for a programming model 22

Slide 57

Slide 57 text

Solution #2 Programming Model • Based on mergeable data structures
 Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures
 Ensure the mergeability property holds through program transformations, data composition 22

Slide 58

Slide 58 text

Solution #2 Programming Model • Based on mergeable data structures
 Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures
 Ensure the mergeability property holds through program transformations, data composition • One example: Lasp
 Lattice Processing (Meiklejohn, Van Roy 2015) 22

Slide 59

Slide 59 text

Solution #2 Programming Model • Based on mergeable data structures
 Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures
 Ensure the mergeability property holds through program transformations, data composition • One example: Lasp
 Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction
 Correct-by-construction distributed programs for infrastructure that provides weak guarantees 22

Slide 60

Slide 60 text

Solution #2 Programming Model • Based on mergeable data structures
 Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures
 Ensure the mergeability property holds through program transformations, data composition • One example: Lasp
 Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction
 Correct-by-construction distributed programs for infrastructure that provides weak guarantees • Result provenance
 Extends CRDT causality/concurrency tracking through to results of applications providing mergeable outcomes 22
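
The idea of keeping results mergeable under composition can be illustrated with grow-only sets. This is plain Python written for this example and is not Lasp's API: `merge`, `gmap`, `gfilter`, and `pipeline` are hypothetical names, and the sketch assumes the simplest mergeable structure, a set whose merge is union.

```python
def merge(a: frozenset, b: frozenset) -> frozenset:
    """Merge for a grow-only set: set union."""
    return a | b


def gmap(f, s: frozenset) -> frozenset:
    """Apply f to every element; the result is again a grow-only set."""
    return frozenset(f(x) for x in s)


def gfilter(p, s: frozenset) -> frozenset:
    """Keep the elements satisfying p; the result is again a grow-only set."""
    return frozenset(x for x in s if p(x))


def pipeline(s: frozenset) -> frozenset:
    """A small composed program: keep odd numbers, then scale by ten."""
    return gmap(lambda x: x * 10, gfilter(lambda x: x % 2 == 1, s))


replica_a = frozenset({1, 2})
replica_b = frozenset({2, 3})

# Running the program on merged inputs gives the same answer as merging
# the outputs computed independently at each replica.
merged_then_run = pipeline(merge(replica_a, replica_b))
run_then_merged = merge(pipeline(replica_a), pipeline(replica_b))
```

Both orders yield the same set, which is the property that lets each replica compute on its local state and still converge on one result. Operations that do not distribute over the merge this way (for example, taking the size of the set) would need more care.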

Slide 61

Slide 61 text

Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy
  Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients’ data 23

Slide 62

Slide 62 text

Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy
  Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients’ data • Clients communicate with other clients
 Exchange state for latency reduction, serve as the primary site for their information 23

Slide 63

Slide 63 text

Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy
  Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients’ data • Clients communicate with other clients
  Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities
  Necessary for latency reduction of large data sets, durability, and as the location of “exactly-once” side effects (e.g., charging a credit card) 23

Slide 64

Slide 64 text

Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy
  Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients’ data • Clients communicate with other clients
  Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities
  Necessary for latency reduction of large data sets, durability, and as the location of “exactly-once” side effects (e.g., charging a credit card) • One example: Skype
  Completely peer-to-peer for operation, but a central server is used for authentication and storage of users’ “address books.” 23

Slide 65

Slide 65 text

Addendum Causality 24

Slide 66

Slide 66 text

What about Causality? • Misnomer “Eventually Consistent World”
 We know that causality drives interactions in the physical world: relativity, light cones, etc. 25

Slide 67

Slide 67 text

What about Causality? • Misnomer “Eventually Consistent World”
 We know that causality drives interactions in the physical world: relativity, light cones, etc. • Causality in distributed systems
  The happens-before relation (Lamport 1978) captures causal relationships between events in a distributed system 25
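
A common way to track happens-before in practice is a vector clock, sketched below. Vector clocks are not named on the slide and came after Lamport's 1978 paper, so treat this as one possible mechanism; the function names `tick`, `receive`, and `compare`, and the dict-based clock representation, are assumptions of this sketch.

```python
def tick(clock: dict, node: str) -> dict:
    """Return a copy of the clock with node's counter advanced by one."""
    out = dict(clock)
    out[node] = out.get(node, 0) + 1
    return out


def receive(local: dict, incoming: dict, node: str) -> dict:
    """On message receipt: take the pointwise maximum, then tick locally."""
    joined = {
        k: max(local.get(k, 0), incoming.get(k, 0))
        for k in local.keys() | incoming.keys()
    }
    return tick(joined, node)


def compare(a: dict, b: dict) -> str:
    """Classify two clocks as 'before', 'after', 'equal', or 'concurrent'.
    Missing entries are treated as zero."""
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"   # a happens-before b
    if b_le_a:
        return "after"    # b happens-before a
    return "concurrent"   # neither event could have influenced the other


sent = tick({}, "p")                 # p performs a send event
received = receive({}, sent, "q")    # q receives p's message
unrelated = tick({}, "r")            # r acts without hearing from p or q
```

Here `sent` happens-before `received` because the message carries p's history to q, while `sent` and `unrelated` are concurrent: neither clock dominates the other.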

Slide 68

Slide 68 text

Causality Tradeoffs • Benefits
 Simplifies the development of systems
 Reason about cause/effect; eliminates storage and maintenance of redundant information 26

Slide 69

Slide 69 text

Causality Tradeoffs • Benefits
 Simplifies the development of systems
 Reason about cause/effect; eliminates storage and maintenance of redundant information • Negatives
 Expensive in storage and maintenance of causal message delivery channels; methods for reduction in state introduce false dependencies 26

Slide 70

Slide 70 text

Where’s the disconnect? • What about causal consistency?
 Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? 27

Slide 71

Slide 71 text

Where’s the disconnect? • What about causal consistency?
 Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay
  Some information may no longer be important after a given period of time, and can be forgotten 27

Slide 72

Slide 72 text

Where’s the disconnect? • What about causal consistency?
 Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay
  Some information may no longer be important after a given period of time, and can be forgotten • Causality formalism needs explicit decay
 The formalism for describing causal consistency requires data be explicitly “decayed” through messages (or tombstones — keeping data forever to satisfy the causal relationship) 27

Slide 73

Slide 73 text

Causality Example By Analogy • Driver’s license example
 I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later 28

Slide 74

Slide 74 text

Causality Example By Analogy • Driver’s license example
 I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply?
 Causality captures that I would not have had the license if I had not learned the rules and passed the test 28

Slide 75

Slide 75 text

Causality Example By Analogy • Driver’s license example
 I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply?
 Causality captures that I would not have had the license if I had not learned the rules and passed the test • What does causal consistency imply?
 Does causal consistency imply that I can still recover the information that I originally used to pass the test? Does it imply that if I can recall my driver’s license number that I should be able to recall the rules that are implied by that license number? 28

Slide 76

Slide 76 text

Conclusion • Some adoption of ideas: “Uber goes Unconventional”
 Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth 29

Slide 77

Slide 77 text

Conclusion • Some adoption of ideas: “Uber goes Unconventional”
 Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting
 Impractical from a storage, bandwidth, and power perspective 29

Slide 78

Slide 78 text

Conclusion • Some adoption of ideas: “Uber goes Unconventional”
 Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting
 Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access
  Sneakernet, USB, and Bluetooth are still the pervasive modes of communication 29

Slide 79

Slide 79 text

Conclusion • Some adoption of ideas: “Uber goes Unconventional”
 Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting
 Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access
  Sneakernet, USB, and Bluetooth are still the pervasive modes of communication • Peer-to-peer designs can provide higher scale
 Grow to planetary scale, new programming models needed to embrace these network designs, new abstractions needed 29

Slide 80

Slide 80 text

Thanks! Christopher Meiklejohn @cmeik http://www.lasp-lang.org 30