
Metrics at Uber, Monitorama 2018

Prateek Rungta

June 07, 2018

Transcript

  1. Metrics at Uber. Prateek Rungta (@prateekrungta), Engineer, M3 Team. Learnings, a few neat Observability Patterns, and our OSS metrics platform.
  2. Uber’s Architecture & Metrics
     - ~4K microservices
     - Central Observability platform; focus on Metrics today
     - Tracing: Yuri’s talk about Jaeger, Monitorama 2017
     - Used for all manner of things:
       - Capacity planning using system metrics (e.g. load average)
       - Real-time alerting using application metrics (e.g. p99 response time for ride requests)
       - Tracking business metrics (e.g. number of UberX riders in Portland)
       - … and plenty more …
  3. Developers! Developers! Developers!

     func myRPCHandler(param int, m MetricScope) {
         t := m.Timer("latency").Start()
         responseCode := client.Call(param)
         t.Stop()
         m.Tagged(map[string]string{"code": responseCode}).Counter("response").Inc(1)
     }
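     The MetricScope on the slide resembles a tally.Scope from Uber's open-source github.com/uber-go/tally library. A minimal sketch of the same instrumentation pattern using that library, assuming tally and treating the handler body (doWork, the status code) as illustrative stand-ins:

     package main

     import (
         "strconv"
         "time"

         "github.com/uber-go/tally"
     )

     func main() {
         // Root scope; in production this would be wired to an M3/StatsD
         // reporter. Buffered metrics are flushed every second.
         scope, closer := tally.NewRootScope(tally.ScopeOptions{Prefix: "myservice"}, time.Second)
         defer closer.Close()

         // Time a piece of work ...
         sw := scope.Timer("latency").Start()
         statusCode := doWork() // hypothetical stand-in for client.Call
         sw.Stop()

         // ... and count responses, tagged by response code.
         scope.Tagged(map[string]string{"code": strconv.Itoa(statusCode)}).
             Counter("response").Inc(1)
     }

     func doWork() int { return 200 }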
  4. “Golden Signals” - usually you want the same telemetry
     - SRE Book: Latency, traffic, errors, and saturation
     - USE Method: Utilisation, saturation, and errors
     - RED Method: Rate, errors, and duration
     - Shout out to Baron Schwartz’s work: video
  5. End User Code (“Biz logic”) / RPC / Storage (C*/Redis/…) / ...
     Library owners:
     - Dashboard panel template = f(serviceName)
     - Ensure the library emits metrics following the given template
     Application devs:
     - Service uses the library
     - Provide “serviceName” at time of generation
     (see the sketch below)
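     A minimal sketch of the pattern above, assuming the tally-style scope from slide 3; the RPCClient type and its metric names are hypothetical. The point is that the library fixes the metric template, so dashboard panels can be generated purely as a function of serviceName:

     package rpcmetrics

     import "github.com/uber-go/tally"

     // RPCClient is a hypothetical library type. Every metric it emits follows
     // the same template, parameterized only by the service name, so library
     // owners can ship a dashboard panel template = f(serviceName).
     type RPCClient struct {
         latency tally.Timer
         errors  tally.Counter
     }

     // Application devs only provide serviceName (and a root scope) when they
     // instantiate the library.
     func NewRPCClient(serviceName string, root tally.Scope) *RPCClient {
         scope := root.Tagged(map[string]string{"service": serviceName}).SubScope("rpc")
         return &RPCClient{
             latency: scope.Timer("latency"),
             errors:  scope.Counter("errors"),
         }
     }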
  6. Auto Alerting
     - Grafana : Dynamic Dashboard :: Manually Configured Alerts : ?
     - E.g. detect anomalies in latency per RPC endpoint
  7. Scale - Ingress: ~6B unique metric IDs (random week when I was making these slides)
  8. Scale - Egress: ~2.2K queries per second (9K Grafana dashboards, 150K real-time alerts) (random week when I was making these slides)
  9. Constantly growing
     - Persisted metrics: 20% uptick in the last quarter
     - Unique IDs: 50% uptick in the last half year
     - QPS: 100% uptick in the last year
     - Ingress traffic: 900x in the last 3 years
  10. A brief history of M3
     - 2014-2015: Graphite
       - No replication, operations were ‘cumbersome’
     - 2015-2016: Cassandra
       - 16x YoY growth
       - Expensive (>1500 Cassandra hosts)
       - “Technology Telemetry company”
       - Compactions ⇒ RF=2 ⇒ repairs too slow
     - 2016-Today: M3DB
  11. M3DB: an open source distributed time series database
     - Store datapoints of arbitrary timestamp precision at any resolution, for any retention
     - Optimized file-system storage with no need for compactions
     - Replicated with zone/rack-aware layout and configurable replication factor
     - Strongly consistent cluster membership backed by etcd
     - Fast streaming for node add/replace/remove by selecting the best peer for a series, while also repairing any mismatching series at time of streaming
  12. M3TSZ Overview
     - m3tsz = tsz + improvements (illustrated below)
     - More details to follow in a blog; for the curious: https://github.com/m3db/m3db/tree/master/src/dbnode/encoding/m3tsz

                                         TSZ      M3TSZ    Improvement
     Number of bytes / datapoint         2.42     1.45     40%
     Compression ratio                   6.56x    11x      40%
     Encoding time (ns) / datapoint      338      298      12%
     Decoding time (ns) / datapoint      347      300      14%

     These results apply the two different algorithms to Uber’s production data.
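     A small illustration (not M3DB code) of the XOR trick at the heart of TSZ-style float compression: consecutive, similar values XOR to words dominated by zero bits, which encode in very few bits; m3tsz layers further improvements on top of this scheme.

     package main

     import (
         "fmt"
         "math"
     )

     func main() {
         prev := math.Float64bits(42.0)
         curr := math.Float64bits(42.5)
         // Similar consecutive values share sign, exponent and most mantissa
         // bits, so their XOR has long runs of zeros (cheap to encode).
         fmt.Printf("%064b\n", prev^curr)
     }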
  13. M3TSZ Impact
     - Data volumes at time of migration (end of 2016):
       - Disk usage ~1.4PB for Cassandra at RF=2
       - Disk usage ~200TB for M3DB at RF=3
  14. Persistence
     - For each incoming write:
       - Data is stored in memory in compressed ‘n’-hour blocks
       - Data is appended to a commit log on disk (think WAL)
     - We periodically write the compressed blocks to disk as immutable fileset files (think snapshot files)
     (a sketch of this write path follows below)
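     A minimal sketch of that write path; the types and method names here are hypothetical illustrations of the description above, not M3DB's actual internals:

     package m3sketch

     import (
         "fmt"
         "os"
         "time"
     )

     // seriesBlock stands in for an in-memory, m3tsz-compressed 'n'-hour block.
     type seriesBlock struct{}

     func (b *seriesBlock) append(ts time.Time, v float64) { /* compress + buffer */ }

     type database struct {
         commitLog  *os.File                // append-only WAL on disk
         openBlocks map[string]*seriesBlock // open in-memory blocks keyed by series ID
     }

     func (d *database) Write(id string, ts time.Time, value float64) error {
         // 1. Append to the commit log so the write survives a process crash.
         if _, err := fmt.Fprintf(d.commitLog, "%s %d %g\n", id, ts.UnixNano(), value); err != nil {
             return err
         }
         // 2. Buffer the datapoint in the open in-memory block for this series.
         blk, ok := d.openBlocks[id]
         if !ok {
             blk = &seriesBlock{}
             d.openBlocks[id] = blk
         }
         blk.append(ts, value)
         return nil
     }

     // flush would run every 'n' hours, writing sealed blocks to disk as
     // immutable fileset files, after which covered commit logs can be removed.
     func (d *database) flush() error { return nil }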
  15. Layout on Disk (time →)
     - /var/lib/m3db/commitlogs/               a sequence of commit log files
     - /var/lib/m3db/data/namespace-a/shard-0  fileset file blocks (data)
     - /var/lib/m3db/index/namespace-a         index fileset file blocks
     Over the same span of time, commit log files cover the shortest windows, data fileset blocks longer ones, and index fileset blocks the longest.
  16. Fileset Files
     - Data is flushed from memory to disk every ‘n’ hours as block filesets
     - Two flavours:
       - Data fileset blocks contain compressed time-series data (m3tsz)
       - Index fileset blocks contain compressed reverse-indexing data (FSTs, postings lists, etc.)
     - Expired block filesets are periodically cleaned up in the background
  17. Commit Log
     - Uncompressed
     - Supports sync and async writes
       - Async for performance: buffer in memory & periodically flush batches (see the sketch below)
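     A minimal sketch of an async, batched commit log writer in that spirit; this is an illustration under the assumptions above, not M3DB's actual commit log implementation:

     package commitlog

     import (
         "bufio"
         "os"
         "time"
     )

     type commitLog struct {
         writes chan []byte
     }

     // newCommitLog starts a background goroutine that buffers queued entries
     // in memory and flushes them to disk in batches every flushEvery.
     func newCommitLog(f *os.File, flushEvery time.Duration) *commitLog {
         cl := &commitLog{writes: make(chan []byte, 4096)}
         go func() {
             buf := bufio.NewWriter(f)
             ticker := time.NewTicker(flushEvery)
             defer ticker.Stop()
             for {
                 select {
                 case entry := <-cl.writes:
                     buf.Write(entry) // buffer in memory
                 case <-ticker.C:
                     buf.Flush() // periodically flush the batch to disk
                 }
             }
         }()
         return cl
     }

     // WriteAsync queues an entry and returns immediately; durability is only
     // guaranteed after the next flush. A sync write would instead wait for
     // the flush before returning.
     func (cl *commitLog) WriteAsync(entry []byte) { cl.writes <- entry }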
  18. Topology & Consistency
     - Strongly consistent topology (using etcd)
     - Consistency managed via synchronous quorum writes and reads
     - Configurable consistency level
     - No hinted hand-off
     - Nodes bootstrap from peers at startup / topology change
     (quorum rule sketched below)
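     The quorum rule implied above, as a small illustration (the function name is ours, not M3DB's API): with replication factor RF, a majority-level write or read needs RF/2 + 1 acknowledgements.

     package quorum

     // majority returns the number of replica acknowledgements required for a
     // quorum read or write at a given replication factor.
     func majority(replicationFactor int) int {
         return replicationFactor/2 + 1
     }

     // e.g. majority(3) == 2: a quorum write succeeds once 2 of the 3 replicas
     // have acknowledged it, and a quorum read consults at least 2 replicas.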
  19. M3DB Impact
     - Increased replication: 2 -> 3x replication factor
     - Read performance improvements (p50 / p95 / p99):
       - C*:   8ms / 270ms / 500ms
       - M3DB: 0.2ms / 0.35ms / 5ms
     - Cheaper(!)
  20. What does production look like today? (per region)
     Components in the picture: hosts running Collectors alongside client applications, an Ingester, an Aggregation Tier, an Indexer backed by ES 5.x, M3DB clusters, and a Query Service, with read and write caches.
  21. OSS

  22. Caveat Emptor: Index & Coordinator
     - Coordinator & Index used in smaller deployments
     - Feature work to use them with multiple-M3DB-cluster deployments (like Uber’s production usage)
     - Index read performance improvements
  23. Where
     - All development on: github.com/m3db/m3db
     - Apache v2 - Contributions welcome!
     - Documentation: http://bit.ly/m3db-docs
     - Reach us via: http://bit.ly/m3db-forums
  24. What’s to come
     - M3DB:
       - Look out for a blog post to drop in July
       - Ability to backfill data
       - Index performance + multi-clustered index
       - Graphite support for M3Coordinator
       - … and plenty more …
     - Aggregator: github.com/m3db/m3aggregator
       - Packaging, documentation, etc.
     - Query Engine (and Query Language)
     - … and plenty more …
  25. Thank you! @prateekrungta
     - Code: github.com/m3db/m3db
     - Docs: http://bit.ly/m3db-docs
     - Forum: http://bit.ly/m3db-forums
     - Slides: http://bit.ly/m3db-monitorama2018