A Certain Tendency of the Database Community Christopher S. Meiklejohn Université catholique de Louvain, Belgium Instituto Superior Técnico, Portugal 1 LIGHT ONE
Certain Tendency • Certain tendency Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created 2
Certain Tendency • Certain tendency Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck Databases serve as a “convenience” that make it easier to write applications: think: shared memory registers: however, reduced availability 2
Certain Tendency • Certain tendency Replicated databases are treated as a “single” system; the databases are the “source of truth” • Data ownership by clients Data is “owned” by the clients that create the data and data exists as soon as it is created • Database is an optimization, bottleneck Databases serve as a “convenience” that make it easier to write applications: think: shared memory registers: however, reduced availability • The edge is the source of truth! We need models and abstractions that allow us to write correct applications that operate with distributed data where it is being generated: the edge 2
Consistency Models • Contract Between the application developer and the system that application will be deployed on • Guaranteed outcomes following certain rules Event interleaving, possible partial-orders, update visibility, when and where, etc. 4
Consistency Models • Contract Between the application developer and the system that application will be deployed on • Guaranteed outcomes following certain rules Event interleaving, possible partial-orders, update visibility, when and where, etc. • Required for building applications Otherwise, we may pick a system to deploy our application on where our application returns incorrect results 4
Strong vs. Weak • Strong Linearizability is the strongest; respects the “real-time” order of events • Weak Eventual consistency; informally specified, no bound on when an update may be visible 5
Eventual Consistency 6 “...the storage system guarantees that if no new updates are made to the [replicated, shared] object, eventually all accesses [to any replica] will return the last updated value.” - W. Vogels
Eventual Consistency 6 “...the storage system guarantees that if no new updates are made to the [replicated, shared] object, eventually all accesses [to any replica] will return the last updated value.” - W. Vogels Rather weak model, but used by many large-scale distributed systems today…
CAP Theorem • Consistency is at odds with availability If systems wish to remain functioning under network partitions, systems must sacrifice one or the other 8
CAP Theorem • Consistency is at odds with availability If systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency Guarantees on event order and event visibility 8
CAP Theorem • Consistency is at odds with availability If systems wish to remain functioning under network partitions, systems must sacrifice one or the other • Consistency Guarantees on event order and event visibility • Availability Ability for a system to keep servicing requests under network partitions and/or failures 8
CAP Example • Two replicas of an reservation system… Replicated for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane 9
CAP Example • Two replicas of an reservation system… Replicated for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: 9
CAP Example • Two replicas of an reservation system… Replicated for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking Return an error to the user and prevent both bookings 9
CAP Example • Two replicas of an reservation system… Replicated for fault-tolerance to ensure system availability • Two concurrent requests… Tom and Chris attempt to reserve the last available seat on a plane • Two possible paths… If one replica of the system can not reach the other replica, we have two choices: • [Favoring Consistency] Prevent booking Return an error to the user and prevent both bookings • [Favoring Availability] Allow concurrent requests However, now the seat is double booked and we must have a “conflict resolution” function for returning the system to a consistent state 9
Recorded Knowledge • Approximation Approximation of globally known knowledge that is periodically recorded • “Potentially outdated” Act of recording this information produces an artifact that is already outdated unless the system has quiesced 11
Message Passing • Exchange messages Members of the same system exchange messages asynchronously • Dropped or delayed Messages can either be dropped or delayed 12
Message Passing • Exchange messages Members of the same system exchange messages asynchronously • Dropped or delayed Messages can either be dropped or delayed • Examples Letters via the postal service; Text messages; Telephone calls 12
Primary Site • Ownership of information Each member in the system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for 13
Primary Site • Ownership of information Each member in the system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system 13
Primary Site • Ownership of information Each member in the system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory 13
Primary Site • Ownership of information Each member in the system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory • Stale replica Outdated printed map 13
Primary Site • Ownership of information Each member in the system owns the primary copy of their information • Coordinates updates Members coordinate updates to information they are the primary site for • Information can be cached Information from other sites can be cached by other members in the system • Local or incomplete replica Use memory • Stale replica Outdated printed map • Primary site Google Maps or the USGS, etc. 13
Database: an Optimization • Graph of primary copy locations Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities 15
Database: an Optimization • Graph of primary copy locations Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database Contracted edges per country, placing a replica in each country that served as the primary copy 15
Database: an Optimization • Graph of primary copy locations Represents all members in the system with the data they create and are responsible for • Contract edges for subgraph Reduce several vertices in the graph to a single vertex: database for those entities • Geo-replicated, EC database Contracted edges per country, placing a replica in each country that served as the primary copy • Wikipedia (for a given topic) Information about a given topic is stored here, written and coordinated by multiple authors 15
Why Optimize? • Expensive Retrieval from the primary site is expensive, if the primary site is geographically distant [latency] or unavailable [availability] 16
Why Optimize? • Expensive Retrieval from the primary site is expensive, if the primary site is geographically distant [latency] or unavailable [availability] • Replication introduces challenges Replication can make maintaining consistency much more challenges if caching/replication is pervasive 16
IoT and Mobile Applications • Centralized won’t scale Storing all data at a central location for processing won’t scale due to power and DC requirements 17
IoT and Mobile Applications • Centralized won’t scale Storing all data at a central location for processing won’t scale due to power and DC requirements • Today’s systems assume centralization Both programming models and applications used today assume centralization of data (ie. Spark, etc.) 17
Database as a Constraint Satisfaction Problem • Where do we route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? 20
Database as a Constraint Satisfaction Problem • Where do we route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? 20
Database as a Constraint Satisfaction Problem • Where do we route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? • How do we bound latency? How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to match a latency bound? 20
Database as a Constraint Satisfaction Problem • Where do we route requests? When we need to retrieve a certain piece of data, how do we know where to route the request to? • How we do specify acceptable staleness? Do we need to route to the primary site or can we use a cache? Does that cache provide a value within an acceptable value of staleness? • How do we bound latency? How do we select an appropriate cache? How do we choose between a cache and a primary site given we have to match a latency bound? • How do we reason about staleness? Across multiple requests for the same object, how do we know which version is newer or older? 20
Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information 21
Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) 21
Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality Capture causality for object mutations and can identify concurrent operations 21
Solution #1 Mergeable Data Structures • Abstract data types for AP/EC systems Encapsulate AP replication concerns and exist in time and space • Merge to most “recent” result Conflict resolution and provenance information • One example: CRDTs Conflict-free Replicated Data Types (Shapiro et al. 2011) • Causality Capture causality for object mutations and can identify concurrent operations • Concurrency Resolve concurrent operations using a bias (think: concurrent add(e) || remove(e) on the same set for same element) 21
Solution #2 Programming Model • Based on mergeable data structures Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition 22
Solution #2 Programming Model • Based on mergeable data structures Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) 22
Solution #2 Programming Model • Based on mergeable data structures Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction Correct-by-construction distributed programs for infrastructure that provides weak guarantees 22
Solution #2 Programming Model • Based on mergeable data structures Mergeable data structures form the core data abstraction for a programming model • Programming through composition of mergeable data structures Ensure the mergeability property holds through program transformations, data composition • One example: Lasp Lattice Processing (Meiklejohn, Van Roy 2015) • Correct-by-construction Correct-by-construction distributed programs for infrastructure that provides weak guarantees • Result provenance Extends CRDT causality/concurrency tracking through to results of applications providing mergeable outcomes 22
Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data 23
Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information 23
Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities Necessary for latency reduction of large data sets, durability, location of “exactly-once” side-effects: ie. charge credit card 23
Solution #3 Remove Role Dichotomy • Eliminate client-server dichotomy Servers shouldn’t be responsible for canonical data and data sharing, but rather serve as a location where particular code will run with clients data • Clients communicate other clients Exchange state for latency reduction, serve as the primary site for their information • Servers as business entities Necessary for latency reduction of large data sets, durability, location of “exactly-once” side-effects: ie. charge credit card • One example: Skype Completely peer-to-peer for operation, but a central server is used for authentication and storage of users “address book.” 23
What about Causality? • Misnomer “Eventually Consistent World” We know that causality drives interactions in the physical world: relativity, light cones, etc. 25
What about Causality? • Misnomer “Eventually Consistent World” We know that causality drives interactions in the physical world: relativity, light cones, etc. • Causality in distributed systems Happens-before relationship (Lamport 1978) describes capturing causal relationships between entities in a distributed system 25
Causality Tradeoffs • Benefits Simplifies the development of systems Reason about cause/effect; eliminates storage and maintenance of redundant information 26
Causality Tradeoffs • Benefits Simplifies the development of systems Reason about cause/effect; eliminates storage and maintenance of redundant information • Negatives Expensive in storage and maintenance of causal message delivery channels; methods for reduction in state introduce false dependencies 26
Where’s the disconnect? • What about causal consistency? Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? 27
Where’s the disconnect? • What about causal consistency? Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay Some information may no longer be important after a given period of time, and forgotten 27
Where’s the disconnect? • What about causal consistency? Does causal consistency provide a better formalism for describing consistency in the world given we know causal relationships hold? • Data has decay Some information may no longer be important after a given period of time, and forgotten • Causality formalism needs explicit decay The formalism for describing causal consistency requires data be explicitly “decayed” through messages (or tombstones — keeping data forever to satisfy the causal relationship) 27
Causality Example By Analogy • Driver’s license example I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later 28
Causality Example By Analogy • Driver’s license example I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply? Causality captures that I would not have had the license if I had not learned the rules and passed the test 28
Causality Example By Analogy • Driver’s license example I learned a bunch of rules about driving a car and passed a driver’s test obtaining a license, which I renewed several years later • What does causality imply? Causality captures that I would not have had the license if I had not learned the rules and passed the test • What does causal consistency imply? Does causal consistency imply that I can still recover the information that I originally used to pass the test? Does it imply that if I can recall my driver’s license number that I should be able to recall the rules that are implied by that license number? 28
Conclusion • Some adoption of ideas: “Uber goes Unconventional” Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth 29
Conclusion • Some adoption of ideas: “Uber goes Unconventional” Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective 29
Conclusion • Some adoption of ideas: “Uber goes Unconventional” Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access Sneakernet, USB, Bluetooth still the pervasive model of communication 29
Conclusion • Some adoption of ideas: “Uber goes Unconventional” Places canonical ride state on the device to bootstrap datacenters under failure: device is source of truth • Datacenter-focused designs are limiting Impractical from a storage, bandwidth, and power perspective • Emerging countries have limited access Sneakernet, USB, Bluetooth still the pervasive model of communication • Peer-to-peer designs can provide higher-scale Grow to planetary scale, new programming models needed to embrace these network designs, new abstractions needed 29