Building Resilient Infrastructure with CouchDB

CouchDB is one of the smaller NoSQL options around at the moment, but it still packs a punch when applied to the right problems. In this talk we'll look at the areas where CouchDB excels and examine some of the mechanisms that make this possible. We'll also take a quick walk through a real deployment of a CouchDB network, backing a large multi-site private-cloud web service with millions of users, and look at some of the benefits and problems CouchDB brings in practice, in this scenario and others.

Tim Perry

April 30, 2014

Transcript

  1. Building resilient infrastructure with CouchDB

    Tim Perry
    Tech Lead & Open-Source Champion at Softwire
    tim-perry.co.uk @pimterry github.com/pimterry
  2. Document Store

    {
      "_id": "my-document-example",
      "_rev": "21-qwe123asd",
      "some-content": { "a": 1, "b": 2 },
      "a list!": [3, 4, 5]
    }
  3. HTTP API

    $ curl -X GET http://couchdb:5984/my-db/a-doc-id
    {
      "_id": "a-doc-id",
      "_rev": "4-9812eojawd",
      "data": [1, 2, 3]
    }
  4. HTTP API

    $ curl -X PUT http://couchdb:5984/my-db/another-id \
        -H 'Content-Type: application/json' \
        -d '{ "other data": 4 }'
    { "ok": true, "id": "another-id", "rev": "1-2902191555" }
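The `_rev` values in these responses are what CouchDB uses to reject writes based on stale data: an update has to quote the document's current revision, otherwise the server answers with a 409 conflict. A minimal in-memory sketch of that check follows. `createStore` and `toyHash` are hypothetical names invented for this illustration, and the revision suffix here is a toy hash rather than whatever CouchDB computes internally.

```javascript
// Toy in-memory store that mimics CouchDB's revision check on writes.
// Illustration only: createStore and toyHash are made-up names, and real
// CouchDB derives the revision suffix differently.
function toyHash(text) {
  let h = 0;
  for (const ch of text) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // stay within unsigned 32 bits
  }
  return h.toString(16);
}

function createStore() {
  const docs = new Map(); // id -> stored document, including _id and _rev

  function put(id, body) {
    const current = docs.get(id);
    const currentRev = current ? current._rev : undefined;
    // The caller must quote the current revision (or none for a new document).
    if (body._rev !== currentRev) {
      return { status: 409, error: "conflict" };
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const { _rev, _id, ...fields } = body;
    const rev = `${generation}-${toyHash(JSON.stringify(fields))}`;
    docs.set(id, { ...fields, _id: id, _rev: rev });
    return { status: 201, ok: true, id, rev };
  }

  function get(id) {
    return docs.get(id);
  }

  return { put, get };
}

// Usage: create a document, then update it by quoting the revision we were given.
const demoStore = createStore();
const created = demoStore.put("another-id", { "other data": 4 });
const updated = demoStore.put("another-id", { _rev: created.rev, "other data": 5 });
console.log(created.status, updated.status);
```

Quoting the revision on every write is what lets two writers notice they raced each other, instead of one silently overwriting the other.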
  5. Replication

    # Pull from B -> A
    $ curl -X POST http://couchdb-A:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{ "source": "http://couchdb-B:5984/demo-db",
              "target": "demo-db",
              "continuous": true }'

    # Pull from A -> B
    $ curl -X POST http://couchdb-B:5984/_replicator \
        -H 'Content-Type: application/json' \
        -d '{ "source": "http://couchdb-A:5984/demo-db",
              "target": "demo-db",
              "continuous": true }'
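Two-way replication here is simply two one-way pull replications, each registered on the node that receives the data. As a rough sketch, the two request bodies could be generated as below. The function names are hypothetical; the hosts, port and database name are the ones from the slide, and nothing is actually sent over the network.

```javascript
// Build the _replicator request bodies for a pair of pull replications.
// pullReplicationDoc and bidirectionalReplication are illustrative names,
// not part of any CouchDB client library.
function pullReplicationDoc(sourceHost, db) {
  return {
    source: `http://${sourceHost}:5984/${db}`, // remote database to pull from
    target: db,                                // local database on the receiving node
    continuous: true,                          // keep replicating as new changes arrive
  };
}

function bidirectionalReplication(hostA, hostB, db) {
  return [
    // Registered on A: pull from B.
    { postTo: `http://${hostA}:5984/_replicator`, body: pullReplicationDoc(hostB, db) },
    // Registered on B: pull from A.
    { postTo: `http://${hostB}:5984/_replicator`, body: pullReplicationDoc(hostA, db) },
  ];
}

const replicationRequests = bidirectionalReplication("couchdb-A", "couchdb-B", "demo-db");
console.log(JSON.stringify(replicationRequests, null, 2));
```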
  6. Let's break everything!

    while true
    do
      curl -X POST 'http://couchdb-A:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at": "'"`date`"'"}' \
        --max-time 0.1
      curl -X POST 'http://couchdb-B:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at": "'"`date`"'"}' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done

    (Some console logging omitted)
  7. Real World Example (Anonymized)

    - B2B SaaS product, with strict SLAs
    - Millions of paying daily users
    - 3,000 servers across 25 datacentres
    - 50,000 requests per second, average
    - Highly latency sensitive
    - Every request needs the (readonly) user session
  8. Bonus Challenges

    - Struggling network infrastructure
    - Frequent loss of connection to datacentres
    - Occasional power outages in datacentres
    - Users can and do roam, worldwide
    - Server failover is always to a different datacentre
    - Datacentres have hub & spoke connectivity only (through London)
  9. Previous Solution

    - Hold all user sessions in memory on every server
    - Announce new sessions to every server with a central message queue
    - Canonical store kept in a single RDBMS (for server initialisation)
  10. Real World Problems

    - Memory usage doesn't scale
    - Network and server failures are big problems
    - Message queue failures are catastrophic problems
  11. CouchDB Solution

    - Small LRU cache in every server
    - CouchDB in every datacentre
    - CouchDB in the central datacentre
    - Hub & spoke replication
    - Servers query local CouchDB by default, or fall back to central CouchDB
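The read path on this slide (cache first, then the datacentre's own CouchDB, then the central one) can be sketched roughly as follows. This is an assumption-heavy illustration rather than the deployment's actual code: `LruCache`, `loadSession`, `fetchLocal` and `fetchCentral` are hypothetical names, and the sketch assumes the fallback triggers both when the local lookup fails and when it returns nothing.

```javascript
// Small LRU cache built on Map, which remembers insertion order.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}

// fetchLocal and fetchCentral are hypothetical async lookups against the
// local and central CouchDB; each resolves to a session object or null.
async function loadSession(sessionId, cache, fetchLocal, fetchCentral) {
  const cached = cache.get(sessionId);
  if (cached !== undefined) return cached;

  let session = null;
  try {
    session = await fetchLocal(sessionId);
  } catch (err) {
    session = null; // local CouchDB unreachable: treat as a miss
  }
  if (session === null || session === undefined) {
    session = await fetchCentral(sessionId); // errors here propagate to the caller
  }
  if (session !== null && session !== undefined) {
    cache.set(sessionId, session);
  }
  return session;
}
```

Because sessions are read-only on this path, a cached or slightly stale copy is acceptable, which is what keeps the cache and the fallback this simple.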
  12. Hoodie: Save data

    $('.addTask .submit').click(function () {
      var desc = $('.addTask .desc').val();
      hoodie.store.add('task', { desc: desc });
    });
  13. Hoodie: Handle new data

    hoodie.store.on('add:task', function (task) {
      $('.taskList').append('<li>' + task.desc + '</li>');
    });
  14. Hoodie: Log in users

    $('.login').click(function () {
      var username = $(".username").val();
      var password = $(".password").val();
      hoodie.account.signIn(username, password)
        .done(loginSuccessful);
    });
  15. Reliable Replication: The Changes Feed

    $ curl -X GET http://couchdb:5984/my-db/_changes?since=1
    {
      "results": [
        { "seq": 2, "id": "my-doc", "changes": [ { "rev": "1-128qw99" } ] },
        { "seq": 3, "id": "my-doc", "changes": [ { "rev": "2-98s9123" } ] }
      ],
      "last_seq": 3
    }
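A consumer of the changes feed typically needs two things from a response: which revisions to fetch for each document, and the sequence value to resume from next time. A minimal sketch of that bookkeeping, using the sample response from this slide, might look like this. `summarizeChanges` is a hypothetical helper, and the integer sequence numbers follow the slide; newer CouchDB versions return opaque sequence strings, which this sketch simply passes through.

```javascript
// Sample response shaped like the _changes output on the slide.
const sampleChanges = {
  results: [
    { seq: 2, id: "my-doc", changes: [{ rev: "1-128qw99" }] },
    { seq: 3, id: "my-doc", changes: [{ rev: "2-98s9123" }] },
  ],
  last_seq: 3,
};

// Collapse a changes response into the latest listed revisions per document,
// plus the checkpoint to pass as ?since= on the next request.
// summarizeChanges is an illustrative name, not a CouchDB API.
function summarizeChanges(changesResponse) {
  const latest = new Map(); // doc id -> { seq, revs }
  for (const row of changesResponse.results) {
    // Rows arrive in sequence order, so later rows overwrite earlier ones.
    latest.set(row.id, { seq: row.seq, revs: row.changes.map((c) => c.rev) });
  }
  return { latest, checkpoint: changesResponse.last_seq };
}

const changeSummary = summarizeChanges(sampleChanges);
console.log(changeSummary.latest.get("my-doc"), changeSummary.checkpoint);
```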
  16. Reliable Replication: Replication Process

    1. Track the source's sequence number in a local-only metadata document in the target DB, unique to this replication, set to 0 initially
    2. Read the changes from the source, since the sequence id stored in the local document in the target
    3. Read any missing document revisions from the source DB
    4. Write these updates to the target DB
    5. Update the sequence number tracked in the target
    6. Go to 2

    (Paraphrased from http://replication.io)
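The replication process on this slide can be simulated in a few lines with in-memory databases. This is a deliberately simplified sketch and not CouchDB's replicator: `createDb`, `writeRevision` and `replicateOnce` are hypothetical names, revisions are opaque strings, and conflict handling is left out entirely. The point it illustrates is that the checkpoint only advances after the writes succeed, so an interrupted run just repeats some work on the next pass.

```javascript
// Minimal in-memory model of a database with a changes log.
function createDb() {
  return {
    seq: 0,
    changes: [],          // [{ seq, id, rev }] in write order
    revisions: new Map(), // "id@rev" -> document body
    local: new Map(),     // local-only metadata documents (never replicated)
  };
}

function writeRevision(db, id, rev, body) {
  const key = `${id}@${rev}`;
  if (db.revisions.has(key)) return; // this revision is already stored
  db.revisions.set(key, body);
  db.seq += 1;
  db.changes.push({ seq: db.seq, id, rev });
}

// One pass of steps 1 to 5; call it repeatedly for step 6.
function replicateOnce(source, target, replicationId) {
  // 1. Read the checkpoint for this replication, defaulting to 0.
  const since = target.local.get(replicationId) ?? 0;
  // 2. Read the source's changes since that checkpoint.
  const pending = source.changes.filter((c) => c.seq > since);
  // 3. Work out which revisions the target is missing.
  const missing = pending.filter((c) => !target.revisions.has(`${c.id}@${c.rev}`));
  // 4. Write the missing revisions to the target.
  for (const c of missing) {
    writeRevision(target, c.id, c.rev, source.revisions.get(`${c.id}@${c.rev}`));
  }
  // 5. Only now advance the checkpoint stored in the target.
  const lastSeq = pending.length > 0 ? pending[pending.length - 1].seq : since;
  target.local.set(replicationId, lastSeq);
  return missing.length;
}

// Usage: two writes on A, then replicate A -> B twice.
const dbA = createDb();
const dbB = createDb();
writeRevision(dbA, "doc-1", "1-aaa", { created_at: "first" });
writeRevision(dbA, "doc-2", "1-bbb", { created_at: "second" });
console.log(replicateOnce(dbA, dbB, "A->B")); // 2 revisions copied
console.log(replicateOnce(dbA, dbB, "A->B")); // 0, nothing new since the checkpoint
```

Because step 3 skips revisions the target already holds, running the same pass twice is harmless, which is the property that makes repeated node failures survivable.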
  17. Did we break everything?

    while true
    do
      curl -X POST 'http://couchdb-A:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at": "'"`date`"'"}' \
        --max-time 0.1
      curl -X POST 'http://couchdb-B:5984/demo-db' \
        -H "content-type: application/json" \
        -d '{"created_at": "'"`date`"'"}' \
        --max-time 0.1
    done

    while true
    do
      vagrant halt couchdb-a --force
      sleep 30
      vagrant up couchdb-a --no-provision
      sleep 30
      vagrant halt couchdb-b --force
      sleep 30
      vagrant up couchdb-b --no-provision
      sleep 30
    done

    (Some console logging omitted)
  18. Any questions?

    Tim Perry
    Tech Lead & Open-Source Champion at Softwire
    tim-perry.co.uk @pimterry github.com/pimterry