Slide 1

Slide 1 text

Building resilient infrastructure with CouchDB

Tim Perry
Tech Lead & Open-Source Champion at Softwire
tim-perry.co.uk
@pimterry
github.com/pimterry

Slide 4

Slide 4 text

Document Store

    {
      "_id": "my-document-example",
      "_rev": "21-qwe123asd",
      "some-content": { "a": 1, "b": 2 },
      "a list!": [3, 4, 5]
    }

Slide 5

Slide 5 text

HTTP API

    $ curl -X GET http://couchdb:5984/my-db/a-doc-id
    {
      "_id": "a-doc-id",
      "_rev": "4-9812eojawd",
      "data": [1, 2, 3]
    }

Slide 6

Slide 6 text

HTTP API

    $ curl -X PUT http://couchdb:5984/my-db/another-id \
        -H 'Content-Type: application/json' \
        -d '{ "other data": 4 }'
    {
      "ok": true,
      "id": "another-id",
      "rev": "1-2902191555"
    }

Slide 7

Slide 7 text

Replication

    # Pull from B -> A
    $ curl -X POST http://couchdb-A:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{
          "source": "http://couchdb-B:5984/demo-db",
          "target": "demo-db",
          "continuous": true
        }'

    # Pull from A -> B
    $ curl -X POST http://couchdb-B:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{
          "source": "http://couchdb-A:5984/demo-db",
          "target": "demo-db",
          "continuous": true
        }'
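The two replication requests mirror each other, so their bodies can be generated from a list of nodes. The following is a minimal sketch, assuming the hostnames, port, and database name shown on the slide; `replicationDoc` and `pullReplications` are hypothetical helper names, not part of CouchDB.

```javascript
// Hypothetical helper: builds the JSON body for one pull replication,
// i.e. the document POSTed to the receiving node's /_replicator database.
function replicationDoc(sourceHost, dbName) {
  return {
    source: `http://${sourceHost}:5984/${dbName}`,
    target: dbName,     // database on the node that receives the POST
    continuous: true    // keep replicating as new changes arrive
  };
}

// For every ordered pair of nodes, the target node pulls from the source node.
function pullReplications(hosts, dbName) {
  const jobs = [];
  for (const target of hosts) {
    for (const source of hosts) {
      if (source !== target) {
        jobs.push({
          postTo: `http://${target}:5984/_replicator`,
          body: replicationDoc(source, dbName)
        });
      }
    }
  }
  return jobs;
}

const jobs = pullReplications(['couchdb-A', 'couchdb-B'], 'demo-db');
console.log(JSON.stringify(jobs, null, 2));
```

With two nodes this produces the same two request bodies as the curl commands; adding a third hostname would produce six pull replications, one per ordered pair.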

Slide 8

Slide 8 text

Indexed Views
Incremental Map/Reduce
ACID (locally)
Erlang-based
Web UI
Show Functions
Filters
Validation
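To make "Indexed Views" and "Map/Reduce" concrete: a CouchDB view is defined by a JavaScript map function that calls `emit(key, value)` for each index row it wants. The sketch below simulates that locally under stated assumptions. In CouchDB itself `emit` is a global and the function is stored as a string in a design document; here it is passed in as a parameter so the example runs standalone. The `type` field and the sample documents are hypothetical, and this sketch rebuilds the whole index each time rather than updating it incrementally.

```javascript
// Map function in the style of a CouchDB view: called once per document.
function mapByType(doc, emit) {
  if (doc.type) {
    emit(doc.type, 1);
  }
}

// Local stand-in for building a view index: run the map function over every
// document, collect the emitted rows, and keep them sorted by key.
function buildView(docs, map) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// Reduce step: count the rows for each key.
function countByKey(rows) {
  const counts = {};
  for (const row of rows) {
    counts[row.key] = (counts[row.key] || 0) + 1;
  }
  return counts;
}

const sampleDocs = [
  { _id: 'doc-1', type: 'task' },
  { _id: 'doc-2', type: 'session' },
  { _id: 'doc-3', type: 'task' },
  { _id: 'doc-4' } // no type: the map function emits nothing for it
];
const viewRows = buildView(sampleDocs, mapByType);
const viewCounts = countByKey(viewRows);
console.log(viewRows, viewCounts);
```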

Slide 9

Slide 9 text

Resilient Infrastructure

Slide 10

Slide 10 text

Let's break everything!

    while true
    do
      curl -X POST 'http://couchdb-A:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{ "created_at": "'"`date`"'" }' \
        --max-time 0.1

      curl -X POST 'http://couchdb-B:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{ "created_at": "'"`date`"'" }' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done

(Some console logging omitted)

Slide 11

Slide 11 text

Is this useful?

Slide 12

Slide 12 text

Real World Example (Anonymized)

B2B SaaS product, with strict SLAs
Millions of paying daily users
3,000 servers across 25 datacentres
50,000 requests per second, average
Highly latency sensitive
Every request needs the (readonly) user session

Slide 13

Slide 13 text

Bonus Challenges

Struggling network infrastructure
Frequent loss of connection to datacentres
Occasional power outages in datacentres
Users can and do roam, worldwide
Server failover is always to a different datacentre
Datacentres have hub & spoke connectivity only (through London)

Slide 14

Slide 14 text

Previous Solution

Hold all user sessions in memory on every server
Announce new sessions to every server with a central message queue
Canonical store kept in a single RDBMS (for server initialisation)

Slide 15

Slide 15 text

Real World Problems

Memory usage doesn't scale
Network and server failures are big problems
Message queue failures are catastrophic problems

Slide 16

Slide 16 text

CouchDB Solution

Small LRU cache in every server
CouchDB in every datacentre
CouchDB in the central datacentre
Hub & spoke replication
Servers query local CouchDB by default, or fall back to central CouchDB
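The lookup path described on this slide (small LRU cache, then the local CouchDB, then the central one) could look roughly like the sketch below. It is an illustration only, not the anonymized product's code. `fetchLocal` and `fetchCentral` are hypothetical async functions standing in for HTTP requests to the two CouchDB instances, and falling back when the local instance simply lacks the session (for example because it has not replicated yet) is an assumption on top of the slide.

```javascript
// Small LRU cache built on Map, which iterates in insertion order.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);          // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      this.map.delete(this.map.keys().next().value); // evict least recently used
    }
  }
}

// fetchLocal / fetchCentral (hypothetical) resolve to a session document,
// resolve to null when the session is not found, or reject on failure.
function createSessionLookup(cache, fetchLocal, fetchCentral) {
  return async function getSession(sessionId) {
    const cached = cache.get(sessionId);
    if (cached !== undefined) return cached;

    let session = null;
    try {
      session = await fetchLocal(sessionId);
    } catch (err) {
      // Local CouchDB unreachable: fall through to the central one.
    }
    if (session === null) {
      session = await fetchCentral(sessionId);
    }
    if (session !== null) cache.set(sessionId, session);
    return session;
  };
}

// Usage with fake fetchers: the local instance is "down", the central one answers.
const getSession = createSessionLookup(
  new LruCache(1000),
  async () => { throw new Error('local CouchDB down'); },
  async (id) => ({ _id: id, user: 'example-user' })
);
getSession('session-123').then((session) => console.log(session));
```

Because sessions are read-only, a cached copy never goes stale, which is what makes a small per-server cache safe here.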

Slide 17

Slide 17 text

Real World Improvements

No single point of failure
Scales horizontally easily
Major memory savings

Slide 18

Slide 18 text

Some Challenges

Ops ramp-up
Support service setup
Disk usage

Slide 19

Slide 19 text

Hoodie http://hood.ie

Slide 20

Slide 20 text

Hoodie

No Backend
Offline-First

Slide 21

Slide 21 text

Hoodie
Save data

    $('.addTask .submit').click(function () {
      var desc = $('.addTask .desc').val();
      hoodie.store.add('task', { desc: desc });
    });

Slide 22

Slide 22 text

Hoodie
Handle new data

    hoodie.store.on('add:task', function (task) {
      $('.taskList').append('<li>' + task.desc + '</li>');
    });

Slide 23

Slide 23 text

Hoodie
Log in users

    $('.login').click(function () {
      var username = $(".username").val();
      var password = $(".password").val();
      hoodie.account.signIn(username, password)
        .done(loginSuccessful);
    });

Slide 24

Slide 24 text

Hoodie Architecture (From the Hoodie team at http://hood.ie/intro#magic, CC-BY-SA-NC)

Slide 25

Slide 25 text

Hoodie Future Architecture (Probably) (Modified, from the Hoodie team's diagram at http://hood.ie/intro#magic, CC-BY-SA-NC)

Slide 26

Slide 26 text

Why does any of this work?

Slide 27

Slide 27 text

Reliable Replication

Slide 28

Slide 28 text

Multiversion Concurrency Control (or MVCC)
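As a rough illustration of what MVCC means for a client: every update must name the revision it is based on, and an update based on a stale revision is rejected instead of silently overwriting newer data (CouchDB answers such a request with HTTP 409 Conflict). The in-memory sketch below shows only that check. `TinyDocStore` is a hypothetical class, and the revision suffix is a random placeholder, whereas real CouchDB derives it from the document content.

```javascript
class ConflictError extends Error {}

class TinyDocStore {
  constructor() {
    this.docs = new Map();
  }

  put(doc) {
    const current = this.docs.get(doc._id);
    const currentRev = current ? current._rev : undefined;
    // The writer must present the revision it read; otherwise someone else
    // has updated the document in the meantime.
    if (doc._rev !== currentRev) {
      throw new ConflictError(`Document update conflict for ${doc._id}`);
    }
    const generation = currentRev ? parseInt(currentRev, 10) + 1 : 1;
    // Placeholder suffix; not how CouchDB computes revision hashes.
    const rev = `${generation}-${Math.random().toString(16).slice(2, 10)}`;
    this.docs.set(doc._id, { ...doc, _rev: rev });
    return { ok: true, id: doc._id, rev };
  }

  get(id) {
    return this.docs.get(id);
  }
}

const store = new TinyDocStore();
const first = store.put({ _id: 'a-doc-id', data: [1, 2, 3] });
const second = store.put({ _id: 'a-doc-id', _rev: first.rev, data: [1, 2, 3, 4] });

let staleWriteRejected = false;
try {
  // Based on the first revision, which is no longer current.
  store.put({ _id: 'a-doc-id', _rev: first.rev, data: [] });
} catch (err) {
  staleWriteRejected = err instanceof ConflictError;
}
console.log(first.rev, second.rev, staleWriteRejected);
```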

Slide 29

Slide 29 text

Reliable Replication
The Changes Feed

    $ curl -X GET 'http://couchdb:5984/my-db/_changes?since=1'
    {
      "results": [
        { "seq": 2, "id": "my-doc", "changes": [ { "rev": "1-128qw99" } ] },
        { "seq": 3, "id": "my-doc", "changes": [ { "rev": "2-98s9123" } ] }
      ],
      "last_seq": 3
    }

Slide 30

Slide 30 text

Reliable Replication
Replication Process

1. Track the source's sequence number in a local-only metadata document in the target DB, unique to this replication, set to 0 initially
2. Read the changes from the source since the sequence number stored in the local document in the target
3. Read any missing document revisions from the source DB
4. Write these updates to the target DB
5. Update the sequence number tracked in the target
6. Go to 2

(Paraphrased from http://replication.io)
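The numbered steps can be sketched against two in-memory databases. This is a simplified model written for illustration, not CouchDB's replicator: `MemoryDb` and `replicateOnce` are hypothetical names, sequence numbers are plain integers, and revision trees and conflict handling are left out.

```javascript
// In-memory stand-in for a database: a change log plus a revision store.
class MemoryDb {
  constructor() {
    this.seq = 0;
    this.changes = [];           // [{ seq, id, rev }]
    this.revisions = new Map();  // "id@rev" -> document body
    this.local = new Map();      // local-only metadata, never replicated
  }
  write(id, rev, body) {
    const key = `${id}@${rev}`;
    if (this.revisions.has(key)) return;   // already have this revision
    this.revisions.set(key, body);
    this.changes.push({ seq: ++this.seq, id, rev });
  }
  changesSince(since) {
    return this.changes.filter((change) => change.seq > since);
  }
  has(id, rev) {
    return this.revisions.has(`${id}@${rev}`);
  }
  read(id, rev) {
    return this.revisions.get(`${id}@${rev}`);
  }
}

// One pass of steps 1 to 5; calling it repeatedly is step 6.
function replicateOnce(source, target, replicationId) {
  const checkpointKey = `_local/${replicationId}`;
  const since = target.local.get(checkpointKey) || 0;      // step 1
  const changes = source.changesSince(since);               // step 2
  let lastSeq = since;
  for (const { seq, id, rev } of changes) {
    if (!target.has(id, rev)) {                             // step 3
      target.write(id, rev, source.read(id, rev));          // step 4
    }
    lastSeq = seq;
  }
  target.local.set(checkpointKey, lastSeq);                 // step 5
  return changes.length;
}

const dbA = new MemoryDb();
const dbB = new MemoryDb();
dbA.write('my-doc', '1-128qw99', { data: 1 });
dbA.write('my-doc', '2-98s9123', { data: 2 });

const firstPass = replicateOnce(dbA, dbB, 'a-to-b');
const secondPass = replicateOnce(dbA, dbB, 'a-to-b'); // nothing new since checkpoint
console.log(firstPass, secondPass, dbB.has('my-doc', '2-98s9123'));
```

Because the checkpoint is only advanced after the writes, a crash in the middle of a pass just means the same changes are read again next time, and the "already have this revision" check makes the repeat harmless.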

Slide 31

Slide 31 text

Append-Only B+ Trees

Slide 32

Slide 32 text

Append-Only B+ Trees (From 'CouchDB: The Definitive Guide', CC-BY)
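The property that matters for resilience is that committed data is never overwritten: an update appends new data and then a new header, so a crash part-way through a write leaves the previous header intact. The sketch below is a deliberately simplified model of that idea and makes no claim to match CouchDB's file format. It uses a flat index copied into each header instead of a B+ tree, and an array in place of a file; `AppendOnlyFile` is a hypothetical name.

```javascript
class AppendOnlyFile {
  constructor() {
    this.records = [];
  }

  // Append the new document, then a header that makes it visible.
  // Nothing that was already written is modified.
  commit(id, body) {
    const index = { ...this.latestIndex() };
    this.records.push({ type: 'doc', id, body });
    index[id] = this.records.length - 1;
    this.records.push({ type: 'header', index });
  }

  // Recovery: scan backwards from the end for the last complete header.
  latestIndex() {
    for (let i = this.records.length - 1; i >= 0; i--) {
      if (this.records[i].type === 'header') return this.records[i].index;
    }
    return {};
  }

  get(id) {
    const offset = this.latestIndex()[id];
    return offset === undefined ? undefined : this.records[offset].body;
  }
}

const file = new AppendOnlyFile();
file.commit('my-doc', { version: 1 });
file.commit('my-doc', { version: 2 });
const beforeCrash = file.get('my-doc');

// Simulate power loss during the second commit: its header never hit disk.
file.records.pop();
const afterCrash = file.get('my-doc');
console.log(beforeCrash, afterCrash);
```

After the simulated crash the store falls back to the last complete header and still returns the first version, with no repair step needed.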

Slide 33

Slide 33 text

Did we break everything?

    while true
    do
      curl -X POST 'http://couchdb-A:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{ "created_at": "'"`date`"'" }' \
        --max-time 0.1

      curl -X POST 'http://couchdb-B:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{ "created_at": "'"`date`"'" }' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done

(Some console logging omitted)
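One way to answer the question on this slide, once both machines are back up and replication has caught up, would be to fetch `/demo-db/_all_docs` from each node and compare the results. The sketch below does only the comparison. `diffAllDocs` is a hypothetical helper, and the two sample responses are hand-written illustrations in the shape `_all_docs` returns, not output captured from the demo.

```javascript
// Compare two _all_docs responses by document id and current revision.
function diffAllDocs(responseA, responseB) {
  const revsA = new Map(responseA.rows.map((row) => [row.id, row.value.rev]));
  const revsB = new Map(responseB.rows.map((row) => [row.id, row.value.rev]));
  const onlyInA = [...revsA.keys()].filter((id) => !revsB.has(id));
  const onlyInB = [...revsB.keys()].filter((id) => !revsA.has(id));
  const differentRev = [...revsA.keys()].filter(
    (id) => revsB.has(id) && revsB.get(id) !== revsA.get(id)
  );
  return { onlyInA, onlyInB, differentRev };
}

// Illustrative sample data (hypothetical ids and revisions).
const fromA = {
  total_rows: 2,
  offset: 0,
  rows: [
    { id: 'doc-1', key: 'doc-1', value: { rev: '1-aaa' } },
    { id: 'doc-2', key: 'doc-2', value: { rev: '1-bbb' } }
  ]
};
const fromB = {
  total_rows: 2,
  offset: 0,
  rows: [
    { id: 'doc-1', key: 'doc-1', value: { rev: '1-aaa' } },
    { id: 'doc-3', key: 'doc-3', value: { rev: '1-ccc' } }
  ]
};

const allDocsDiff = diffAllDocs(fromA, fromB);
console.log(allDocsDiff);
```

If replication has fully converged, all three lists come back empty.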

Slide 34

Slide 34 text

Phew.

Slide 35

Slide 35 text

CouchDB is not perfect

Slide 36

Slide 36 text

But 'always available' is a great superpower

Slide 37

Slide 37 text

Any questions?

Tim Perry
Tech Lead & Open-Source Champion at Softwire
tim-perry.co.uk
@pimterry
github.com/pimterry