Elasticsearch & 63 Million WordPress Sites

xyu
February 06, 2014

Overview of the Elasticsearch infrastructure that Automattic maintains to support WordPress.com.

Transcript

  1. Elasticsearch + WordPress.com

     Cluster Stats
     • 63M Sites
     • 743M Documents
     • 12TB Primary + Replicas
     • 51M Query Ops / Day
     • 15M Index Ops / Day

     2 Major Use Cases
     • Global Search
     • Local Search
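To put the slide's daily totals in more familiar units, the short Python sketch below converts them to average per-second rates and derives two ratios. It uses only the figures quoted on the slide; the assumption that load is spread evenly over the day is mine, so real peak rates are likely higher than these averages.

```python
# Figures quoted on the slide.
SITES = 63_000_000
DOCUMENTS = 743_000_000
QUERY_OPS_PER_DAY = 51_000_000
INDEX_OPS_PER_DAY = 15_000_000

SECONDS_PER_DAY = 24 * 60 * 60


def per_second(ops_per_day: int) -> float:
    """Average rate per second, assuming load is spread evenly over the day."""
    return ops_per_day / SECONDS_PER_DAY


avg_query_rate = per_second(QUERY_OPS_PER_DAY)
avg_index_rate = per_second(INDEX_OPS_PER_DAY)
docs_per_site = DOCUMENTS / SITES
queries_per_index_op = QUERY_OPS_PER_DAY / INDEX_OPS_PER_DAY

print(f"average query ops/s:  {avg_query_rate:.0f}")
print(f"average index ops/s:  {avg_index_rate:.0f}")
print(f"documents per site:   {docs_per_site:.1f}")
print(f"queries per index op: {queries_per_index_op:.1f}")
```

On these numbers the cluster averages roughly 590 query operations and 174 index operations per second, with about 12 documents per site.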
  2. Infrastructure Layout

     (Diagram. Labels, in extraction order: Internal API, Cache, REST API, PHP, Node 1, Node 2, Cluster A, Node 1, Node 2, Node 3, Node n, Cluster B, Stats)
  3. Documents & Types

     /index/post
     {
       blog_id: 123,
       post_id: 456,
       title: "Search!",
       content: "…",
       blog: {
         lang: "en",
         …
       },
       …
     }

     /index/blog
     {
       blog_id: 123,
       url: "www.xyu.io",
       follower_ids: [
         789,
         …
       ],
       lang: "en",
       indexable: true,
       …
     }
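As a rough illustration of how a post document like the one on this slide might be assembled and addressed, here is a minimal Python sketch. It builds only the fields visible on the slide (the elided ones are left out) and produces the method, path, and JSON body of an index request without sending it. The helper names and the `123-456` document ID format are hypothetical; the `/{index}/{type}/{id}` path follows the URL shape used by Elasticsearch versions that support mapping types, and nothing here is taken from the WordPress.com code.

```python
import json


def post_document(blog_id: int, post_id: int, title: str, content: str, lang: str) -> dict:
    """Build a post document containing only the fields shown on the slide."""
    return {
        "blog_id": blog_id,
        "post_id": post_id,
        "title": title,
        "content": content,
        "blog": {"lang": lang},  # further blog fields are elided on the slide
    }


def index_request(index: str, doc_type: str, doc_id: str, document: dict) -> tuple:
    """Return (method, path, body) for an index request; nothing is sent.

    The doc_id format is a hypothetical choice made for this example.
    """
    path = f"/{index}/{doc_type}/{doc_id}"
    return "PUT", path, json.dumps(document)


doc = post_document(123, 456, "Search!", "…", "en")
method, path, body = index_request("index", "post", "123-456", doc)

print(method, path)
print(body)
```

The post document repeats some blog-level data (here `blog.lang`), which is what the nested `blog` object on the slide suggests: a post can then be filtered by its site's language without consulting the separate blog document.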
  4. Indices & Shards

     Storage Strategy
     • Grow Number of Indices (10M Sites / Index)
     • 25 Shards / Index (400K Sites / Shard)
     • 3 Copies of Data (1 Primary + 2 Replicas)

     2 Major Use Cases
     • Global Search
       • Query All Shards
     • Local Search
       • Query One Shard