Slide 1

Slide 1 text

Joan Touzet ❦ https://atypical.net/ ❦ wohali

Slide 2

Slide 2 text

CouchDB & Apache
• Contributor / User (~2008)
• Committer (Feb 2013)
• PMC member (April 2014)
• ASF Member (2015)
• Apache Board of Directors (2019)

Slide 3

Slide 3 text


Slide 4

Slide 4 text

• The “original” NoSQL (…but we were provably first!)
• Document-oriented structure
• Map-Reduce
• Streaming changes feeds

Slide 5

Slide 5 text


Slide 6

Slide 6 text

• Couch file
  – holds a B-tree
  – 1 file per database or view group (1 design document = 1 view group)
  – Databases: indexed by ID and by sequence number
  – Views: one B-tree (key space) per view in a design doc
• Replicator
  – “just a client process”
  – Source → Target. Multi-master & bidirectional.
• HTTP layer + authentication
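The two database indexes named above can be illustrated with a small in-memory model. This is only a sketch: the class `TinyDb` and its methods are hypothetical names, plain dicts stand in for the on-disk trees, and revision handling is left out.

```python
class TinyDb:
    """Toy model of a database indexed by ID and by sequence number."""

    def __init__(self):
        self.update_seq = 0   # increases by one on every write
        self.by_id = {}       # doc ID -> (seq, body)
        self.by_seq = {}      # seq -> doc ID (latest write of each doc only)

    def save(self, doc_id, body):
        """Store a document and return the sequence number of the write."""
        old = self.by_id.get(doc_id)
        if old is not None:
            # The previous sequence entry for this doc is superseded.
            del self.by_seq[old[0]]
        self.update_seq += 1
        self.by_id[doc_id] = (self.update_seq, body)
        self.by_seq[self.update_seq] = doc_id
        return self.update_seq

    def changes(self, since=0):
        """List (seq, doc ID) pairs written after `since`, in write order."""
        return [(seq, self.by_seq[seq]) for seq in sorted(self.by_seq) if seq > since]


db = TinyDb()
db.save("bob", {"age": 30})
db.save("alice", {"age": 41})
db.save("bob", {"age": 31})
print(db.changes())         # [(2, 'alice'), (3, 'bob')]
print(db.changes(since=2))  # [(3, 'bob')]
```

The by-sequence index is what lets a replicator or a changes feed ask "what changed since sequence N?" without scanning every document.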

Slide 7

Slide 7 text

CouchDB 1.x, by itself, was a fully consistent database. Unintentionally. When replicating with another DB, it was eventually consistent, but with document conflicts.

Slide 8

Slide 8 text

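[Figure: two databases each start with bob v1. One is updated to bob v2a, the other to bob v2b. After replication, both hold bob with revisions v2a and v2b.]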

Slide 9

Slide 9 text

• Clustered HTTP
• Clustered CouchDB API layer (Dynamo model)
• Low-latency, highly parallel remote call (RPC) library
• Magic “consistent” shard mapping database
• “Basically” CouchDB 1.x, but with enhancements

Slide 10

Slide 10 text

• CouchDB 2.x has native clustering functionality
• “Internal replication” is optimized for this process
• CouchDB 2.x shards the database for optimization
• CouchDB has no leader election or “global coordinator”!

Slide 11

Slide 11 text

q = number of shards (default: 8; 4 here for a good picture)
n = number of replicas (default: 3)
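How q and n interact can be sketched in a few lines. The code below is an illustration under stated assumptions, not CouchDB's implementation: it assumes a 32-bit key space split into q equal ranges (q a power of two), a CRC32 hash of the document ID, and a simple round-robin placement of the n copies. All function and node names are hypothetical.

```python
import zlib

KEY_SPACE = 2 ** 32  # assumed size of the hash key space


def shard_ranges(q):
    """Split the key space into q equal, contiguous (start, end) ranges."""
    step = KEY_SPACE // q
    return [(i * step, (i + 1) * step - 1) for i in range(q)]


def shard_for(doc_id, q):
    """Return the index of the range that contains the hash of the doc ID."""
    h = zlib.crc32(doc_id.encode("utf-8"))  # assumed hash function
    return h // (KEY_SPACE // q)


def replica_nodes(shard_index, n, nodes):
    """Place n copies of a shard on consecutive nodes (hypothetical placement)."""
    return [nodes[(shard_index + i) % len(nodes)] for i in range(n)]


q, n = 4, 3
nodes = ["node1", "node2", "node3"]
for start, end in shard_ranges(q):
    print(f"{start:08x}-{end:08x}")
idx = shard_for("bob", q)
print(idx, replica_nodes(idx, n, nodes))
```

With q = 4 the loop prints the four ranges 00000000-3fffffff through c0000000-ffffffff. Every document ID hashes into exactly one range, and each range is stored n times.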

Slide 12

Slide 12 text

[Figure: CouchDB 1.x next to CouchDB 2.x. Labels: HTTP; nodes 1, 2, 3; Erlang.]

Slide 13

Slide 13 text

[Figure, 00:00.000: the three replicas hold bob v1, bob v1, bob v1.]

Slide 14

Slide 14 text

[Figure, 00:01.000: the three replicas hold bob v2a, bob v1, bob v2b.]

Slide 15

Slide 15 text

[Figure, 00:01.001: the three replicas hold bob v2a, bob v2b, bob v2b.]

Slide 16

Slide 16 text

[Figure, 00:01.002: the three replicas hold bob v2a, bob v2b, bob v2b; one ✕.]

Slide 17

Slide 17 text

[Figure, 00:01.003: the three replicas hold bob v2a, bob v2b, bob v2b; two ✕.]

Slide 18

Slide 18 text

[Figure, 00:01.004: the three replicas hold bob v2a, bob v2b, bob v2b.]
• copies = 2, n = 3: Quorum OK
• copies = 1, n = 3: Quorum NG
Quorum: ≥ (n + 1) / 2 copies
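The quorum check on this slide is simple arithmetic. A minimal sketch, assuming the rule is "a majority of the n replicas" (the function names are hypothetical):

```python
def quorum(n):
    """Smallest number of copies that is a majority of n replicas."""
    return n // 2 + 1


def quorum_met(copies, n):
    """True if enough replicas hold the write to form a majority."""
    return copies >= quorum(n)


print(quorum(3))         # 2
print(quorum_met(2, 3))  # True  (quorum OK)
print(quorum_met(1, 3))  # False (quorum NG)
```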

Slide 19

Slide 19 text

[Figure, 00:01.009: the three replicas hold bob v2a, bob v2b, bob v2b.]
• copies = 2, n = 3: Quorum OK
• copies = 1, n = 3: Quorum NG
Responses: 201 Created, 202 Accepted
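One way to read the two status codes: the write that reached a quorum of replicas is answered with 201 Created, while the write that was stored on fewer copies is answered with 202 Accepted. The sketch below models that decision; `write_status` is a hypothetical helper, not CouchDB code.

```python
def write_status(copies, n):
    """Map the number of replicas that stored a write to an HTTP status."""
    if copies < 1:
        raise ValueError("the write was not stored on any replica")
    needed = n // 2 + 1  # majority of the n replicas
    return 201 if copies >= needed else 202


print(write_status(2, 3))  # 201
print(write_status(1, 3))  # 202
```

A client that receives 202 Accepted knows the document was stored, but cannot yet assume that a majority of replicas agree on it.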

Slide 20

Slide 20 text

[Figure, 00:01.010: all three replicas hold both bob v2a and bob v2b.]
bob v2a “arbitrarily” wins!
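The winner is "arbitrary" in the sense that it has nothing to do with which write came first, yet every replica has to pick the same one without coordination. A sketch of one deterministic rule, assuming revisions of the form `<generation>-<hash>` and assuming the order "prefer non-deleted, then the highest generation, then the highest hash". The revision IDs below are invented for illustration.

```python
def rev_key(rev):
    """Sort key for a revision: live beats deleted, then generation, then hash."""
    generation, _, digest = rev["rev"].partition("-")
    return (not rev["deleted"], int(generation), digest)


def pick_winner(leaf_revs):
    """Every replica that sees the same leaf revisions picks the same winner."""
    return max(leaf_revs, key=rev_key)


v2a = {"name": "v2a", "rev": "2-f3a1", "deleted": False}  # made-up hash
v2b = {"name": "v2b", "rev": "2-7c09", "deleted": False}  # made-up hash

# The order in which a replica learned about the revisions does not matter.
print(pick_winner([v2a, v2b])["name"])  # v2a
print(pick_winner([v2b, v2a])["name"])  # v2a
```

The losing revision is not discarded. It stays attached to the document as a conflict for the application to resolve.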

Slide 21

Slide 21 text

You bet. But that’s eventual consistency for you.
Q: What if “Blue” and “Purple” are the same app with 2 consecutive writes?!
Applications need to design around this:
• Single application writer per document, or
• Clearly defined hand-offs between different stages of processing, or
• Stream-based model (documents never modified), or
• Database-per-user model
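The stream-based option can be sketched as follows. Instead of updating one shared document, each writer adds a new document that is never modified, and the current state is derived by reducing over them. The account example, field names, and helper functions are hypothetical, and a dict stands in for the database.

```python
import uuid


def new_event(account, amount):
    """Create an immutable document with a unique ID; it is never updated."""
    return {"_id": uuid.uuid4().hex, "account": account, "amount": amount}


def balance(docs, account):
    """Derive the current state by reducing over all event documents."""
    return sum(d["amount"] for d in docs if d["account"] == account)


# Writers only ever add documents; nobody edits an existing one,
# so there is no shared revision for two writes to conflict on.
store = {}
for event in (new_event("bob", 100), new_event("bob", -30), new_event("bob", 5)):
    store[event["_id"]] = event

print(balance(store.values(), "bob"))  # 75
```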

Slide 22

Slide 22 text

CouchDB 3.0 will be “the best CouchDB 2.x,” adding:
• Per-document access restrictions
• Automated shard splitting
• Automatic view warming
• Better automatic compaction
• “Ready for Lucene Search” (without a recompile)
• Optional highly tunable I/O queue (IOQ2)
• …plus a long-term support (LTS) strategy
• …and the same semantics as CouchDB 2.x.

Slide 23

Slide 23 text

CouchDB 4.0 will have a new storage layer based on FoundationDB.
• Fully consistent, distributed data store
• 10 years in the making by a dedicated development team
• Intended only as the underlying infrastructure for other databases
• CouchDB implemented as a “Layer” on top of FoundationDB
  – CouchDB 4.0 is a completely stateless application layer for FoundationDB.

Slide 24

Slide 24 text

• CouchDB FDB Layer implements CouchDB (1.x) semantics and indexes
• FDB is a consistent MVCC key-value store using Paxos coordination and a transactional authority
• FDB can be a single instance (on your Raspberry Pi or laptop) or a cluster of hundreds of Linux machines

Slide 25

Slide 25 text

• Written in C++, using actor-based concurrency (very similar to Erlang)
• Uses ACID-compliant transactions
  – This allows us to bring back CouchDB 1.x semantics! (And keep our ‘crash-proof’ design.)
  – User-visible transactions may come to a future CouchDB!
• Imposes some restrictions:
  – 10 MB per transaction
  – 5 seconds per transaction
  – Keys and values have size restrictions (10 kB and 100 kB respectively)
• CouchDB documents will be broken up into multiple FoundationDB keys and values
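To make the last point concrete, here is a rough sketch of splitting one document across several key-value pairs so that no value exceeds the value size limit. It takes the 100k limit to mean 100,000 bytes, and the key layout (database, document ID, chunk number) and function names are hypothetical. It does not show how the CouchDB layer actually encodes documents.

```python
import json

MAX_VALUE_BYTES = 100_000  # assumed reading of the value size limit


def split_doc(db, doc_id, doc):
    """Serialize a document and split it into (key, value) pairs.

    The key layout (db, doc_id, chunk number) is hypothetical.
    """
    raw = json.dumps(doc).encode("utf-8")
    return [
        ((db, doc_id, i // MAX_VALUE_BYTES), raw[i:i + MAX_VALUE_BYTES])
        for i in range(0, len(raw), MAX_VALUE_BYTES)
    ]


def join_doc(pairs):
    """Reassemble a document from its chunks, ordered by key."""
    raw = b"".join(value for _, value in sorted(pairs))
    return json.loads(raw)


doc = {"_id": "bob", "notes": "x" * 250_000}
pairs = split_doc("mydb", "bob", doc)
print(len(pairs))                                        # 3
print(all(len(v) <= MAX_VALUE_BYTES for _, v in pairs))  # True
print(join_doc(pairs) == doc)                            # True
```

A real layer would also have to keep all chunks of one write inside the per-transaction size and time limits listed above.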

Slide 26

Slide 26 text

• CouchDB 4.0 will have:
  – CouchDB 1.0 semantics
  – CouchDB 2.0 clustering
  – Plus more new features yet to be announced.

Slide 27

Slide 27 text

Joan Touzet ❦ https://atypical.net/ ❦ wohali