Blizzard: Building a Near Real-Time Data Pipeline

Learn how Blizzard Entertainment -- makers of Overwatch and World of Warcraft -- leverages Elasticsearch, Kibana, Logstash, Kafka, tribes, and Node.js to generate actionable value from gamer and server events.

Chris Burkhart | Technical Lead, Data Team | Blizzard Entertainment
Jordan Irwin | Technical Lead, Battle.net Engineering Systems | Blizzard Entertainment

Elastic Co

March 08, 2017

Transcript

  1. Building a Near Real-Time Pipeline for All Things Blizzard
    Jordan Irwin / Chris Burkhart, Technical Leads
    Blizzard Entertainment, March 8th, 2017
    @jordanirwin / @ctide
  2. Is This Easy Mode?
    ??? Data
    Step 1: Generate Good Data
    Step 2: Collect and Analyze
    Step 3: Profit!
  3. Quest List
    • Brief history of Big Data at Blizzard
      – Where it began
      – The world could use more hero…ic data
      – Glimpse into our future
    • Elastic Stack: GG or OMG?
    • Lessons learned
    • Tidbits
  4. A long long time ago…
    • Protocol Buffers as IDL
    • Server data only
    • Publish directly to RMQ
    • Federation galore
    • Map/Reduce all the things
    • Standard “data lake” approach
  5. Back Then
    [Diagram labels: Game Server, RMQ, Flume, Hadoop, Game Client, API, Map/Reduce]
    * Limited client support eventually added…
  6. The Good Parts
    • From zero to hero: It worked!
    • Data driven decisions now possible
    • Positive “data culture” formed
    • Protocol Buffers well established
    • Foundation for Big Data at Blizzard
  7. The Bad Parts
    • Schemas coordinated via emails (if at all)
    • Map/Reduce requires specialized expertise
    • More effort preparing data vs analyzing it
    • RMQ scaling became non-trivial
  8. You must construct additional pylons
    Some goals…
    • ~20 billion messages/day
    • Schema Registry
    • Collect data from anywhere
    • “Free the data”
  9. Road to Overwatch
    [Diagram labels: Game Server, SDK, Kafka, Hadoop, Elasticsearch, Game Client, Git repo for Schemas, Map/Reduce, Tribe, API, Metrics, Logs, Logstash, Logstash, Kibana]
  10. Immediate Winz
    • Client data meant new ways to debug
      – CCU Drops tied to ISPs
      – Network Quality reports
      – Measurable customer impact
    • Even better than server monitoring!
    • Centralized log searching
      – RIP grep
    • All in near real-time!
  11. The Good Parts
    • Elasticsearch + Kibana accessibility
      – Single “pane of glass”
      – Easy to use
      – Instant data
    • “Free the data” worked
    • Much higher scalability
    • SDKs for multiple languages/platforms
      – C#, C++ (PC/Xbox/PS4)
    • Offered a schema storage place
    • The business LOVED IT
  12. The Bad Parts
    • Schemas not required and avoided
      – Not really a true registry
      – Dynamic mapping nightmares
      – Converted “data lake” into “data swamp”
    • Map/Reduce all-the-things still a problem
    • Tribe Node instability meant frequent outages
    • Metrics solution wasn’t scalable (ingest)
    • Logging wasn’t sustainable (configuration)
    We needed a bigger boat...
  13. MOAR PYLONS!
    Reconsidered goals…
    • ~100 billion messages/day
    • Schema Registry revisited and required
    • Collect data from anywhere
    • “Free the data” even more
    • Easy to onboard
    • Dogfood everything
  14. Today-ish
    [Diagram labels: Kafka, Hadoop, SDK, Schema Registry, …, Kibana, API, Enrich, Kafka, Game Client, Elasticsearch, Tribe*, Game Server, Logs, Metrics, TDK]
  15. The Good Parts
    • Required and robust Schema Registry
      – “What You Registered Is What You Get” (WYRWYG)
    • Telemetry Development Kit (TDK)
    • Improved and expanded SDKs
      – C#, C++ (PC/Mac/Xbox/PS4), Python, Java, NodeJS, Android, Unity, Go*
    • Documentation prioritization
    • Telem-Telem: Dog food is tasty
    • Stable Tribe Nodes (Thanks Elastic!)
    • Extendible for more features
    • Less map/reduce, moar insight
  16. The Bad Parts
    • Deprecated metrics support (for now)
    • Limited logging support
    • Dozens of global Elasticsearch clusters constituting a single system isn’t trivial (but still possible!)
      – Monitoring
      – Logging
      – Auditing
      – JVM GC
      – Updates…
  17. 21

  18. Future
    • Upgrade to 5.x Elastic Stack (/shivers)
    • Logging 4realz
    • Metrics 4sho (w/rollup!)
    • Custom transforms
    • Subscriptions
    • ODBC/JDBC
    • Machine Learning
    • … and much, much more!
  19. The Good Parts
    • Leverages existing foundation
    • Low-risk updates allow major features
    • Pairs tools with access patterns
    • Favors extensibility
  20. Think Globally
    • Proven architecture
    • Vetted by influential companies
    • Best parts of popular pipelines
    • Blizzard will be a global leader in Big Data, Soon™
  21. GG Elastic
    • “Free the data” contributor
    • Kibana makes data accessible
    • Tribe Nodes centralize data
    • Aliases abstract index names
    • Fast time to insight
    • APIs allow tooling
    • Shield controls access
    • Communication with Elastic has been great!
  22. OMG Elastic
    • Shield can get complicated
    • Kibana multi-tenancy needs love
    • Tribes are great… when they work
    • Logs can be spammy
    • Auditing gaps (who did what?)
    • Bad actors can ruin the fun
  23. We were not prepared
    • Take schema management seriously
    • Let use-cases drive development
    • Expect success
    • Get data flowing ASAP
  24. Data Data
    • Message Rate
      – Billions/day
    • Elasticsearch Storage
      – Hundreds of Terabytes
    • HDFS Storage
      – Petabytes
    So sorry no real details :(
  25. Shameless “Plug”
    • Using NodeJS with Kafka?
      – We open sourced node-rdkafka
      – https://github.com/Blizzard/node-rdkafka
    • We’re Hiring!
      – Know someone?
      – Am someone?
      – Java / Scala / Kafka / Hadoop / Big Data
      – http://careers.blizzard.com
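
The “What You Registered Is What You Get” idea from slide 15 (a required schema registry, in contrast to the optional one on slide 12) can be illustrated with a rough sketch. This is not Blizzard's implementation: the deck gives no code, and the `SchemaRegistry` class, its `register` and `validate` methods, the `is_accepted` helper, and the `match_started` event are all hypothetical names. The deck describes Protocol Buffers schemas; plain Python dicts stand in for them here so the sketch stays self-contained.

```python
from typing import Any, Dict


class SchemaRegistry:
    """Hypothetical in-memory registry: schema name -> {field name: Python type}."""

    def __init__(self) -> None:
        self._schemas: Dict[str, Dict[str, type]] = {}

    def register(self, name: str, fields: Dict[str, type]) -> None:
        # Store a copy so later changes to the caller's dict do not alter the schema.
        self._schemas[name] = dict(fields)

    def validate(self, name: str, message: Dict[str, Any]) -> None:
        """Raise ValueError unless the message matches its registered schema exactly."""
        if name not in self._schemas:
            raise ValueError(f"unregistered schema: {name}")
        fields = self._schemas[name]
        missing = sorted(fields.keys() - message.keys())
        unknown = sorted(message.keys() - fields.keys())
        if missing or unknown:
            raise ValueError(f"{name}: missing={missing} unknown={unknown}")
        for key, expected in fields.items():
            if not isinstance(message[key], expected):
                raise ValueError(f"{name}.{key}: expected {expected.__name__}")


def is_accepted(registry: SchemaRegistry, name: str, message: Dict[str, Any]) -> bool:
    """Return True if the message would be allowed into the pipeline."""
    try:
        registry.validate(name, message)
    except ValueError:
        return False
    return True


# Assumed example event; the field names are made up for illustration.
registry = SchemaRegistry()
registry.register("match_started", {"player_id": int, "region": str})
```

In this sketch a message for an unregistered schema, or one with missing, extra, or wrongly typed fields, is rejected before it reaches storage, which is one way to avoid the “dynamic mapping nightmares” listed on slide 12.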