
Handling data in distributed systems

Aviran Mordo
September 27, 2018

Some of the patterns Wix.com uses to handle data in a distributed system.

Transcript

  1. Aviran Mordo, VP of Engineering, Wix.com
    Handling Data in Distributed Systems: Arrested by the CAP
    Twitter: @aviranm, linkedin/aviran, aviransplace.com
  2. Wix.com in Numbers
    • 130M website builders (+2M monthly)
    • 600M monthly visitors
    • Multiple clouds & data centers (Google, Amazon)
    • 2000 employees (~50% R&D)
    • 5 R&D centers: Vilnius (Lithuania), Kiev and Dnipro (Ukraine), Tel-Aviv and Be’er Sheva (Israel)
    • #5 best software company to work for worldwide (according to Glassdoor)
  3. Agenda
    • Avoiding database transactions
    • Handling database schema changes
    • Read consistency in a distributed system
    • Dealing with multiple datacenters
  4. Logical DB Transaction: Saving a Wix Site’s Data
    1. Save each page as an atomic operation.
    2. Finalize the transaction by sending the site header (pointers to pages).
    Can generate orphaned pages; not a problem in practice.
    [Diagram: Browser (Editor) and Server; "Save page(s)" goes to the Site Pages DB, "Save header" (list of page IDs) goes to the Site Header DB.]
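
The two-step save on this slide can be sketched as follows. This is a minimal illustration, not Wix's implementation: plain dicts stand in for the Pages DB and Site Header DB, and the function names and the use of random UUIDs as page IDs are assumptions made for the example.

```python
import uuid


def save_pages(pages_db, pages):
    """Step 1: store each page on its own; every write is atomic by itself."""
    page_ids = []
    for content in pages:
        page_id = uuid.uuid4().hex  # hypothetical ID scheme for this sketch
        pages_db[page_id] = content
        page_ids.append(page_id)
    return page_ids


def save_header(headers_db, site_id, page_ids):
    """Step 2: one write of the header "commits" the whole save."""
    headers_db[site_id] = list(page_ids)


def load_site(pages_db, headers_db, site_id):
    """Readers only follow pointers found in the header."""
    return [pages_db[pid] for pid in headers_db.get(site_id, [])]
```

If the editor fails between the two steps, the new pages stay in the Pages DB as orphans, but the header still points at the previous pages, so readers never see a half-saved site.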
  5. Write Traffic May Flow to Both Datacenters
    [Diagram: browsers send "Save page" to the Pages MySQL in DC-1 and to the Pages MySQL in DC-2.]
  6. Replication Conflict
    MySQL strategy: stop replication, or ignore the conflict (drop incoming).
    Wix users change millions of pages every day.
    [Diagram: Pages MySQL in DC-1 and Pages MySQL in DC-2.]
  7. Avoiding Replication Conflicts
    Page ID is a content-based hash:
    • Immutable data
    • Idempotent operation
    DB conflicts can be safely ignored as content is identical.
    [Diagram: Pages MySQL in DC-1 and Pages MySQL in DC-2.]
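
A small sketch of why content-based IDs make replication conflicts harmless. The slide does not say which hash is used; SHA-256 is an assumption here, and the dicts and function names are hypothetical stand-ins for the two Pages databases.

```python
import hashlib


def page_id(content: bytes) -> str:
    """Derive the ID from the content (SHA-256 is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def save_page(db: dict, content: bytes) -> str:
    """Idempotent save: repeating it leaves the store unchanged."""
    pid = page_id(content)
    db.setdefault(pid, content)
    return pid


def apply_replicated_row(db: dict, pid: str, content: bytes) -> bool:
    """Apply a row arriving from the other DC.

    A duplicate key is dropped: the same ID implies the same content.
    """
    if pid in db:
        return False
    db[pid] = content
    return True
```

When both datacenters receive the same page, each computes the same ID, and the incoming replicated row can be dropped without losing data. Because a page never changes after it is written, an edit produces a new ID rather than an update to an existing row.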
  8. Database Changes
    1. Add Fields
    2. Remove Fields
    3. Complete Schema / Database Change
    Altering very large tables may take a very long time and cause downtime.
  9. Database Changes
    1. Add Fields
      1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
    2. Remove Fields
    3. Complete Schema / Database Change
  10. Database Changes
    1. Add Fields
      1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
      1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
    2. Remove Fields
    3. Complete Schema / Database Change
  11. Database Changes
    1. Add Fields
      1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
      1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
    2. Remove Fields: stop using the field in the code. Do not make any DB schema changes.
    3. Complete Schema / Database Change
  12. Database Changes
    1. Add Fields
      1.1. For adding metadata (non-indexed fields): use a blob field for schema flexibility (JSON works really well).
      1.2. If the fields are searchable (indexed fields): use another table and join by primary key.
    2. Remove Fields: stop using the field in the code. Do not make any DB schema changes.
    3. Complete Schema / Database Change: lazy migration.
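
Point 1.1 can be illustrated with a short sketch. It uses Python's built-in `sqlite3` as a stand-in for the production database, and the `products` table, its columns, and the helper names are all hypothetical.

```python
import json
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    # Fixed columns plus one JSON blob for non-indexed metadata.
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, metadata TEXT)"
    )
    return conn


def save_product(conn, product_id, name, **metadata):
    """New metadata fields need no ALTER TABLE; they just go into the blob."""
    conn.execute(
        "INSERT OR REPLACE INTO products (id, name, metadata) VALUES (?, ?, ?)",
        (product_id, name, json.dumps(metadata)),
    )


def load_product(conn, product_id):
    row = conn.execute(
        "SELECT id, name, metadata FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        return None
    return {"id": row[0], "name": row[1], "metadata": json.loads(row[2])}
```

Adding a field is then a code change only, for example `save_product(conn, 1, "mug", color="red", weight_g=300)`. Removing a field follows point 2: the code stops reading and writing the key, and the table is left alone.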
  13. Feature Toggle = Code Branch
    Not just a Boolean, can also be a state.
    Can have criteria:
    • Company employees
    • Specific users / group
    • Percentage of traffic
    • By GEO
    • By language
    • By user-agent
    • User profile based
    • Any other context…
    [Diagram: the feature toggle (FT) chooses between New Code (FT open) and Old Code.]
    http://github.com/wix/petri
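
A feature toggle with criteria might look like the sketch below. It does not use the Petri API; the `Context` and `FeatureToggle` classes, their fields, and the hash-based traffic bucketing are assumptions made to illustrate three of the criteria on the slide (employees, GEO, percentage of traffic).

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Context:
    user_id: str
    is_employee: bool = False
    country: str = ""


@dataclass
class FeatureToggle:
    name: str
    employees_only: bool = False
    countries: frozenset = frozenset()  # empty means "any country"
    percentage: int = 100               # share of eligible traffic, 0..100

    def is_open(self, ctx: Context) -> bool:
        if self.employees_only and not ctx.is_employee:
            return False
        if self.countries and ctx.country not in self.countries:
            return False
        # Stable bucketing: the same user always lands in the same bucket.
        digest = hashlib.sha256(f"{self.name}:{ctx.user_id}".encode()).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < self.percentage


def serve(toggle: FeatureToggle, ctx: Context) -> str:
    """The toggle is a code branch: open runs new code, closed runs old code."""
    return "new code" if toggle.is_open(ctx) else "old code"
```

Hashing the toggle name together with the user ID keeps a user's experience stable across requests while the percentage is ramped up.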
  14. New DB Schema with Data Migration
    • Deploy the new schema/DB
    • Plan a lazy migration path controlled by feature toggle
  15. Migration steps
    #1 Write to old / Read from old
    #2 Write to both (first old, then new) / Read from old
    #3 Write to both / Read from new, fallback to old
    #4 Write only to new / Read from new, fallback to old
    #5 Eagerly migrate data in the background
    #6 Write and read to new; remove migration code
    Callouts on the slide:
    • Backward compatibility is a must!
    • Distributed transaction: fail on write to old, "ignore" failure on new.
    • Point of no return, warning! Your old DB is now read-only and will not change.
    http://www.aviransplace.com/2015/12/15/safe-database-migration-pattern-without-downtime/
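
The migration steps can be sketched as a store wrapper whose behavior depends on the current phase. This is an illustration under assumptions, not the code from the linked post: dicts stand in for the two databases, the class and phase names are hypothetical, and copying a record into the new store when a read falls back to the old one is one reading of "lazy migration". In practice the phase would be driven by a feature toggle.

```python
from enum import IntEnum


class Phase(IntEnum):
    OLD_ONLY = 1             # #1 write to old / read from old
    DUAL_WRITE_READ_OLD = 2  # #2 write to both (old first) / read from old
    DUAL_WRITE_READ_NEW = 3  # #3 write to both / read from new, fallback to old
    NEW_WRITE_FALLBACK = 4   # #4 write only to new / read from new, fallback to old
    NEW_ONLY = 6             # #6 after the eager migration (#5) has finished


class MigratingStore:
    def __init__(self, old, new, phase):
        self.old, self.new, self.phase = old, new, phase

    def write(self, key, value):
        if self.phase <= Phase.DUAL_WRITE_READ_NEW:
            self.old[key] = value  # an error here fails the whole write
            if self.phase >= Phase.DUAL_WRITE_READ_OLD:
                try:
                    self.new[key] = value
                except Exception:
                    pass           # old is still the source of truth
        else:
            self.new[key] = value  # point of no return: old is read-only

    def read(self, key):
        if self.phase <= Phase.DUAL_WRITE_READ_OLD:
            return self.old.get(key)
        if key in self.new or self.phase == Phase.NEW_ONLY:
            return self.new.get(key)
        value = self.old.get(key)
        if value is not None:
            self.new[key] = value  # lazy migration on fallback (assumption)
        return value


def migrate_eagerly(old, new):
    """#5: background copy of whatever was never touched; keeps newer values."""
    for key, value in old.items():
        new.setdefault(key, value)
```

Up to phase 3 the toggle can be rolled back safely because the old store still receives every write. From phase 4 onward the old store no longer changes, which is why the slide marks it as the point of no return.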
  16. Store owner updates a product’s details
    [Diagram: UpdateProduct(…) to the Product Service; "Save data" to the Master DB; "Replicate" to the Slave DB.]
  17. Store owner wants to view a product for update
    Usually not an issue...
    [Diagram: GetProduct(…) to the Product Service; "Read data"; Master DB, "Replicate", Slave DB.]
  18. Store owner wants to view a product for update
    ...unless there’s a replication lag.
    [Diagram: GetProduct(…) to the Product Service; "Read data"; Master DB, "Replicate", Slave DB.]
  19. Store owner wants to view a product for update
    Separate API for consistent reads: GetConsistentProduct(…)
    [Diagram: GetConsistentProduct(…) to the Product Service; "Read data"; Master DB, "Replicate", Slave DB.]
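
The idea of a separate API for consistent reads can be sketched as below. The class is hypothetical: two dicts stand in for the master and slave databases, replication is simulated with an explicit `replicate()` call to model lag, and the assumption is that the regular read goes to the replica while the consistent read goes to the master.

```python
class ProductService:
    def __init__(self):
        self.master = {}
        self.replica = {}
        self._pending = []  # changes written to master but not yet replicated

    def update_product(self, product_id, details):
        self.master[product_id] = details
        self._pending.append((product_id, details))

    def replicate(self):
        """Simulates the replication stream catching up."""
        for product_id, details in self._pending:
            self.replica[product_id] = details
        self._pending.clear()

    def get_product(self, product_id):
        """Cheap read from the replica; may be stale during replication lag."""
        return self.replica.get(product_id)

    def get_consistent_product(self, product_id):
        """Read-your-writes: always served from the master."""
        return self.master.get(product_id)
```

Only flows that must see the latest write, such as opening a product for editing right after saving it, need the consistent call. Everything else can keep reading from replicas.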
  20. Multiple Data Centers
    [Diagram: DC-1 and DC-2 each have a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs; GetConsistentProduct(…) is called in both DCs.]
  21. Cross DC Replication Lag
    Inconsistent data.
    [Diagram: DC-1 and DC-2 each have a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs; GetConsistentProduct(…) is called in both DCs.]
  22. Cross DC Flows
    [Diagram: DC-1 and DC-2 each have a Load Balancer, a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs.]
  23. Configure Master DC in the LB: Configure API-level Stickiness
    Master DC: DC-1.
    [Diagram: DC-1 and DC-2 each have a Load Balancer, a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs; two GetConsistentProduct(…) calls are shown.]
  24. Configure Master DC in the LB: Configure API-level Stickiness
    Master DC: DC-1.
    Pros:
    • Fine-grained control over the API
    • No changes for the service
    Cons:
    • Complicated LB configuration
    • Multiple connection strings (one for the master and one for the replica DB)
    [Diagram: same as the previous slide.]
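
API-level stickiness amounts to a routing rule in the load balancer. The sketch below is a hypothetical rule expressed in Python rather than real LB configuration; the constant names and the set of sticky APIs are assumptions, with only `GetConsistentProduct` and DC-1 as master taken from the slide.

```python
MASTER_DC = "DC-1"
STICKY_APIS = {"GetConsistentProduct"}  # APIs pinned to the master DC


def route(api_name: str, local_dc: str) -> str:
    """Return the datacenter that should serve this call."""
    if api_name in STICKY_APIS:
        return MASTER_DC  # consistent reads always go to the master DC
    return local_dc       # everything else stays in the DC it arrived at
```

The service itself is unchanged, which matches the "no changes for the service" pro, while the list of sticky APIs has to be maintained in every load balancer, which is where the configuration complexity comes from.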
  25. Configure Master DC in the LB: Configure Service-level Stickiness
    Master DC: DC-1.
    [Diagram: DC-1 and DC-2 each have a Load Balancer, a Product Write Service, a Product Read Service, a Master DB and a Slave DB, with "Replicate" arrows; a "Replicate" arrow links the two DCs; GetConsistentProduct(…) is shown.]
  26. Configure Master DC in the LB: Configure Service-level Stickiness
    Master DC: DC-1.
    Pros:
    • No multiple DB connection strings
    • Simpler LB configuration
    • Fits microservices architecture best practice
    • Better for scaling read services
    Cons:
    • More complicated system (adding another microservice)
    • Additional service for the client to talk with
    [Diagram: same as the previous slide.]
  27. Configure Master DC in the SQL Proxy
    Master DC: DC-1.
    [Diagram: DC-1 and DC-2 each have a Load Balancer, a Product Service, a SQL Proxy, a Master DB and a Slave DB, with "Replicate" arrows; a "Replicate" arrow links the two DCs; GetConsistentProduct(…) is shown.]
  28. Configure Master DC in the SQL Proxy
    Master DC: DC-1.
    Pros:
    • Simple microservice DB configuration
    • DB replication lag monitoring
    • Adds DB maintenance flexibility
    Cons:
    • Adds DB access latency
    • Takes control away from the developers
    [Diagram: same as the previous slide.]
  29. Client Routing
    [Diagram: the Browser calls GetProduct(…) in DC-1 and in DC-2; each DC has a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs.]
  30. Client Routing
    Master DC: DC-1.
    [Diagram: the Browser calls GetConsistentProduct(…) and GetProduct(…); each DC has a Product Service, a Master DB and a Slave DB, with "Read data" and "Replicate" arrows; a "Replicate" arrow links the two DCs.]
  31. Client Routing
    Master DC: DC-1.
    Pros:
    • Fine-grained control over the API
    • Simpler DC configuration
    Cons:
    • Complicated client configuration
    • Traffic changes require updating all clients with the new config
    [Diagram: same as the previous slide.]
  32. Recap
    • Option 1: API-level cross DC
    • Option 2: Separate Service
    • Option 3: ProxySQL (pin to DC)
    • Option 4: Client routing
  33. What We Do at Wix
    • Option 1: API-level cross DC
    • Option 2: Separate Service
    • Option 3: ProxySQL (pin to DC)
    • Option 4: Client routing
  34. Informing the users of eventual consistency processes
    "Your changes are being applied, it may take a few minutes to show up on the site…"
  35. Arrow -> Distributed System
    • Avoiding database transactions
    • Handling database schema changes
    • Read consistency in a distributed system
    • Dealing with multiple datacenters