Wix Architecture At Scale

QCon London 2014.
In this talk I will go over Wix's architecture: how we evolved our system to be highly available even in worst-case scenarios when everything can break, how we built a self-healing, eventually consistent system for website data distribution, and some of the patterns that help us render 45M websites while maintaining a relatively low number of servers.

Aviran Mordo

March 06, 2014

Transcript

  1. Wix in Numbers
    • Over 45,000,000 users
    • 1M new users/month
    • Static storage is >800TB of data
    • 1.5TB new files/day
    • 3 data centers + 2 clouds (Google, Amazon)
    • 300 servers
    • 700M HTTP requests/day
    • 600 people work at Wix, of which ~200 in R&D
  2. Initial Architecture
    • Built for fast development
    • Stateful login (Tomcat session), Ehcache, file uploads
    • No consideration for performance, scalability and testing
    • Intended for short-term use
    • Tomcat, Hibernate, custom web framework
    [Diagram. Components: Wix (Tomcat), Lighttpd (file serving), MySQL DB]
  3. The Monolithic Giant
    • One monolithic server that handled everything
    • Dependency between features
    • Changes in unrelated areas of the system caused deployment of the whole system
    • Failure in unrelated areas will cause system-wide downtime
  4. Concerns and SLA
    • Data Validation
    • Security / Authentication
    • Data consistency
    • Lots of data
    • Edit websites
    • High availability
    • High performance
    • Lots of static files
    • Very high traffic volume
    • Viewport optimization
    • Cacheable data
    • Serving Media
    • High availability
    • High performance
    • High traffic volume
    • Long tail
    • View sites, created by Wix editor
  5. Making SOA Guidelines
    • Each service has its own database (if one is needed)
    • Only one service can write to a specific DB
    • There may be additional read-only services that directly access the DB (for performance reasons)
    • Services are stateless
    • No DB transactions
    • Cache is not a building block, but an optimization
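The last guideline on slide 5 (cache as an optimization, not a building block) can be illustrated with a minimal Python sketch. Everything named here (`DictCache`, `read_site`, the dict standing in for the database) is hypothetical and not taken from Wix's code. The only point is that a read still succeeds from the owning service's database whether the cache is present, absent, or failing.

```python
class DictCache:
    """Hypothetical in-process cache used only for this illustration."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def read_site(site_id, db, cache=None):
    """Read a record, treating the cache as optional.

    `db` is a plain dict standing in for the service's own database.
    """
    if cache is not None:
        try:
            hit = cache.get(site_id)
            if hit is not None:
                return hit
        except Exception:
            pass  # a broken cache must not break the read path
    value = db[site_id]  # the database stays the source of truth
    if cache is not None:
        try:
            cache.set(site_id, value)
        except Exception:
            pass  # failing to populate the cache is not an error
    return value
```

Removing the `cache` argument changes latency only, not behavior, which is one way to read "optimization" on the slide.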
  6. Editor Server
    • Immutable JSON pages (~2.5M / day)
    • Site revisions
    • Active-standby MySQL across data centers
    [Diagram. Components: Editor Server, MySQL Active Sites, MySQL Archive]
  7. Protect The Data
    • Protect against DB outage with fast recovery = replication
    • Protect against data poisoning/corruption = revisions / backup
    • Make the data available at all times = data distribution to multiple locations / providers
  8. Saving Editor Data
    [Diagram. Components: Browser, Editor Server, Static Grid, Google Cloud Storage, MySQL Active Sites and MySQL Archive (each shown twice), Archive (Amazon), Archive (Google). Flow labels: Save Page(s), 200 OK, Save Page, Upload, Notify (twice), Download Page, DC replication]
  9. Self Healing Process
    [Diagram. Components: Browser, Editor Server, Static Grid, Google Cloud Storage, MySQL Active Sites and MySQL Archive (each shown twice), Archive (Amazon), Archive (Google). Flow labels: Save Page(s), Save Page, Upload, Notify (twice), Download Page, DC replication, 200 OK]
  10. No DB Transactions
    • Save each page (JSON) as an atomic operation
    • Page ID is a content-based hash (immutable/idempotent)
    • Finalize transaction by sending site header (list of pages)
    • Can generate orphaned pages, not a problem in practice
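The save flow on slide 10 can be sketched in Python as follows. This is an illustration under stated assumptions, not Wix's implementation: the hash function (SHA-256 over canonical JSON), the `PageStore` class, and the in-memory dicts standing in for the database are all hypothetical. It shows why content-derived IDs make page saves idempotent and why writing the site header last behaves like a commit.

```python
import hashlib
import json


def page_id(page):
    """Derive a page ID from the page content (assumed scheme: SHA-256 of canonical JSON)."""
    canonical = json.dumps(page, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class PageStore:
    """In-memory stand-in for the page store; pages are never modified once written."""

    def __init__(self):
        self.pages = {}         # page ID -> serialized page
        self.site_headers = {}  # site ID -> list of page IDs

    def save_page(self, page):
        pid = page_id(page)
        # Idempotent: saving identical content again maps to the same key.
        self.pages.setdefault(pid, json.dumps(page, sort_keys=True))
        return pid

    def save_site(self, site_id, pages):
        # Each page is saved on its own. If this loop fails midway, the
        # pages already written are orphans and the old header still applies.
        ids = [self.save_page(p) for p in pages]
        # Writing the header last publishes the new page list in one step.
        self.site_headers[site_id] = ids
        return ids
```

Because a retry after a partial failure rewrites the same keys with the same values, no rollback is needed; the cost is the occasional orphaned page the slide mentions.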
  11. Prospero – Wix Media Storage
    • 800TB user media files
    • 3M files uploaded daily
    • 500M metadata records
    • Dynamic media processing
        • Picture resize, crop and sharpen “on the fly”
        • Watermark
        • Audio format conversion
  12. Prospero – Wix Media Manager
    [Diagram. Locations: Austin, Tampa, Google Cloud, CDN (Austin and Tampa are each annotated "x36 T x36 T x32"). Flow labels: get image.jpg, If not in CDN, First fallback, Second fallback]
  13. Public Segment Roles
    • Routing (resolve URLs)
    • Dispatching (to a renderer)
    • Rendering (HTML, XML, TXT)
    [Diagram. www.example.com, Public Server, and the renderers: HTML Renderer, HTML SEO Renderer, Flash Renderer, Flash SEO Renderer, Sitemap Renderer, Robots.txt Renderer]
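The three roles on slide 13 can be sketched as a small Python function. The routing table, the site ID, the renderer functions, and their outputs are all hypothetical placeholders; the sketch only shows the separation between resolving a URL to a site, choosing a renderer, and producing output.

```python
from urllib.parse import urlsplit


def render_html(site_id):
    return ("text/html", f"<!-- bootstrap page for {site_id} -->")


def render_sitemap(site_id):
    return ("application/xml", f"<urlset><!-- pages of {site_id} --></urlset>")


def render_robots(site_id):
    return ("text/plain", "User-agent: *\nAllow: /\n")


# Hypothetical routing table: host name -> site ID.
ROUTES = {"www.example.com": "site-123"}

# Hypothetical dispatch table: path -> renderer; anything else gets HTML.
RENDERERS = {"/sitemap.xml": render_sitemap, "/robots.txt": render_robots}


def handle(url):
    parts = urlsplit(url)
    site_id = ROUTES.get(parts.hostname or "")          # routing
    if site_id is None:
        return None                                     # unknown site
    renderer = RENDERERS.get(parts.path, render_html)   # dispatching
    return renderer(site_id)                            # rendering
```

For example, `handle("http://www.example.com/robots.txt")` resolves the host to `site-123`, picks `render_robots`, and returns a `text/plain` body.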
  14. Publish A Site
    • Publish site header (a map of pages for a site)
    • Publish routing table
    [Diagram. Editor Segment to Public Segment, labeled "Publish site header / routes"]
  15. Built For Speed
    • Minimize out-of-service hops (2 DB, 1 RPC)
    • Lookup tables are cached in memory, updated every 5 minutes
    • Denormalized data – optimize for read by primary key (MySQL)
    • Minimize business logic
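The in-memory lookup tables from slide 15 might look like the Python sketch below. The class name, the `loader` callable, and the injectable clock are assumptions made for illustration; only the 5-minute refresh interval comes from the slide. The table is reloaded lazily on the first read after the interval elapses, and a failed reload keeps serving the previous copy.

```python
import time


class RefreshingLookup:
    """Holds a whole lookup table in memory and reloads it at a fixed interval."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader      # callable returning a dict (e.g. read from the DB)
        self._ttl = ttl_seconds    # 300 s = the 5 minutes mentioned on the slide
        self._clock = clock        # injectable so the refresh can be tested
        self._table = loader()
        self._loaded_at = clock()

    def get(self, key):
        if self._clock() - self._loaded_at >= self._ttl:
            try:
                self._table = self._loader()
            except Exception:
                pass  # keep serving the stale table if the reload fails
            # Reset the timer either way, so a failing source is retried
            # once per interval rather than on every request.
            self._loaded_at = self._clock()
        return self._table.get(key)
```

Reads never touch the database between refreshes, which fits the slide's goal of limiting a request to a small, fixed number of out-of-service hops.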
  16. How a Page Gets Rendered
    • Bootstrap HTML template that contains only data
        • Only JavaScript imports
        • JSON data (site-header + dynamic data)
        • No “real” HTML view
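A bootstrap page of the kind slide 16 describes could be generated as in this Python sketch. The function name, the `window.siteData` variable, and the script URL are hypothetical; the actual template is not shown in the deck. The output contains script imports and embedded JSON but no rendered markup, leaving the view to client-side JavaScript.

```python
import html
import json


def bootstrap_html(site_header, script_urls):
    """Build a page that carries only script imports and JSON data."""
    imports = "\n".join(
        f'<script src="{html.escape(url, quote=True)}"></script>'
        for url in script_urls
    )
    # Escape "</" so user content cannot close the inline script element early.
    data = json.dumps(site_header).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"{imports}\n"
        f"<script>window.siteData = {data};</script>\n"
        "</head>\n<body></body>\n</html>\n"
    )
```

The empty `<body>` is deliberate: everything visible is built in the browser from the embedded data.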
  17. The average Intel Core i750 can push up to 7 GFLOPS without overclocking
  18. Why JSON?
    • Easy to parse in JavaScript and Java/Scala
    • Fairly compact text format
    • Highly compressible (5:1 even for small payloads)
    • Easy to fix rendering bugs (just deploy new client code)
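The compressibility claim on slide 18 is easy to check against your own payloads with a few lines of Python. The sample page below is made up for illustration, so the ratio it prints says nothing about Wix's real data; repetitive keys are what make JSON of this shape compress well.

```python
import gzip
import json


def compression_ratio(payload):
    """Return uncompressed size divided by gzip-compressed size for a JSON payload."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    packed = gzip.compress(raw)
    return len(raw) / len(packed)


# Hypothetical page description with repeated keys, as site JSON tends to have.
sample_page = {
    "pageId": "home",
    "components": [
        {"type": "text", "x": 10 * i, "y": 20 * i, "width": 300, "height": 40,
         "text": f"Paragraph {i}"}
        for i in range(40)
    ],
}

print(f"compression ratio: {compression_ratio(sample_page):.1f}:1")
```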
  19. Serving a Site – Sunny Day
    [Diagram. Components: Browser (http://example.wix.com), LB, Public Renderer, CDN, Statics, Archive. Flow labels: HTTP Request (twice), HTML, Resources / Media, Store HTML to cache, Notify site view]
  20. Serving a Site – DC Lost
    [Diagram. Components: Browser (http://example.wix.com), LB and Public Renderer (each shown twice), CDN, Statics, Archive. Flow labels: HTTP Request, Change DNS]
  21. Serving a Site – Public Lost
    [Diagram. Components: Browser (http://example.wix.com), LB, Public Renderer, CDN, Statics, Archive. Flow labels: HTTP Request, Get Cached HTML Version, HTML]
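Slides 19 and 21 together suggest a store-then-fall-back pattern: HTML is stored on the sunny-day path and a cached version is served when the public renderer is lost. The Python sketch below illustrates that idea only. The `serve` function, the `RendererUnavailable` exception, and the dict used as a cache are hypothetical, and the deck does not say which component performs this logic.

```python
class RendererUnavailable(Exception):
    """Raised by the hypothetical renderer callable when the public segment is down."""


def serve(url, render, html_cache):
    """Return HTML for `url`, falling back to the last stored copy on failure."""
    try:
        html = render(url)
    except RendererUnavailable:
        # "Public lost": serve the last HTML stored for this URL, if any.
        return html_cache.get(url)
    # "Sunny day": store the fresh HTML so it is available during an outage.
    html_cache[url] = html
    return html
```

A site that was never rendered before the outage has nothing cached, so `serve` returns `None` for it; the fallback only covers sites that were viewed at least once.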
  22. Living in the Browser
    [Diagram. Components: Browser (http://example.wix.com), LB, Public Renderer, Editor, CDN, Statics, Archive. Flow labels: HTTP Request, HTML, JSON / Media, Fallback (twice)]
  23. Summary
    • Identify your critical path and concerns
    • Build redundancy in critical path (for availability)
    • De-normalize data (for performance)
    • Minimize out-of-process hops (for performance)
    • Take advantage of client’s CPU power