
Thanos - Prometheus at Scale (OS Summit EU)

Bartek
October 22, 2018

https://osseu18.sched.com/event/FxWY/bof-thanos-high-availability-and-long-term-storage-of-prometheus-metrics-bartek-plotka-improbable

Prometheus has been thriving for several years, yet some questions remain largely unaddressed. How can we store historical data on the order of petabytes in a reliable and cost-efficient way? Can we do so without sacrificing responsive query times? And what about a global view of all our metrics and transparent handling of HA setups?

Thanos takes Prometheus' strong foundations and extends them into a clustered, yet coordination-free, globally scalable metric system. It retains Prometheus' simple operational model and even simplifies deployments further. Under the hood, Thanos uses highly cost-efficient object storage that’s available in virtually all environments today. By building directly on top of the storage format introduced with Prometheus 2.0, Thanos achieves near real-time responsiveness even for cold queries against historical data.


Transcript

  1. Bartek Płotka Prometheus at Scale Edinburgh, 22nd October 2018 github.com/improbable-eng/thanos @bwplotka
  2. Bartek Płotka Software Engineer bartek@improbable.io @bwplotka

  3. Founded: 2012 Employees: 310 Games in Development: +19 "Improbable’s platform, SpatialOS, is designed to let anyone build massive simulations, running in the cloud: imagine Minecraft with thousands of players in the same space or researchers creating simulated cities to model the behaviour of millions. Its ultimate goal: to create totally immersive, persistent virtual worlds." - WIRED, May 2017
  4. @bwplotka Agenda Prometheus

  5. @bwplotka Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus

  6. @bwplotka Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos

  7. @bwplotka Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos +

    + + + +
  8. @bwplotka Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos +

    + + + + Thanos + Open Source
  9. Prometheus https://github.com/prometheus/prometheus

  10. @bwplotka Prometheus Query Engine Scrape Engine Compactor Rule & Alert

    Engine Prometheus Service X Service X Services /metrics every 15s HTTP Query API Grafana Alertmanager Local storage
  11. @bwplotka Cluster Single Prometheus Prometheus Grafana Alertmanager Workloads you want

    to monitor
  12. @bwplotka Cluster Single Prometheus Prometheus Grafana Alertmanager Workloads you want to monitor. 200 000 samples / sec = 1 CPU, 13 GB RAM = 3 million series with 15s scrape interval.
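
The two figures on this slide are tied together by simple arithmetic: at a fixed scrape interval each active series contributes one sample per interval, so the ingestion rate multiplied by the interval gives the series count. A minimal sketch, assuming every series is scraped exactly once per interval (the function name is illustrative and not from the talk):

```python
def active_series(samples_per_sec: int, scrape_interval_s: int) -> int:
    """Number of series that produces the given ingestion rate.

    Assumes each series yields exactly one sample per scrape interval.
    """
    return samples_per_sec * scrape_interval_s


print(active_series(200_000, 15))  # 3000000
```
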
  13. @bwplotka Cluster High Availability Prometheus Grafana Alertmanager Workloads you want

    to monitor Prometheus Prometheus
  14. @bwplotka Cluster 1 Globally distributed clusters Cluster 2 Prometheus Cluster

    n Cluster n+1 Prometheus ...
  15. @bwplotka Problem: Global View Prometheus Prometheus sum(rate(go_memstats_alloc_bytes_total[1m])) ? source =

    “a” source = “b”
  16. @bwplotka Problem: Global View + HA Prometheus Prometheus go_memstats_alloc_bytes_total{..., replica=A} go_memstats_alloc_bytes_total{..., replica=B} go_memstats_alloc_bytes_total{...} [deduplicated]
  17. @bwplotka Problem: Metric retention

  18. @bwplotka Problem: Metric retention SSD Prometheus Prometheus Remote write

  19. @bwplotka Thanos Goals • Have a global view • Have

    a HA in place • Increase retention
  20. Global View See everything from a single place!

  21. @bwplotka SSD Prometheus Prometheus Targets

  22. @bwplotka SSD Sidecar Prometheus Sidecar Targets gRPC (Store API)

  23. @bwplotka Store API

        service Store {
          rpc Series(SeriesRequest) returns (stream SeriesResponse);
          rpc LabelNames(LabelNamesRequest) returns (LabelNamesResponse);
          rpc LabelValues(LabelValuesRequest) returns (LabelValuesResponse);
        }

        message SeriesRequest {
          int64 min_time = 1;
          int64 max_time = 2;
          repeated LabelMatcher matchers = 3;
        }

    Sidecar Prometheus remote read Store API
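
To make the shape of the Store API more concrete, here is a toy in-memory backend that answers the same three calls. It is a sketch only: the class and field names are hypothetical, matchers are limited to exact equality, and the time bounds are assumed to be inclusive, none of which is stated on the slide.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


@dataclass
class Series:
    labels: Dict[str, str]
    samples: List[Sample]


class InMemoryStore:
    """Toy stand-in for a Store API backend (hypothetical, not Thanos code)."""

    def __init__(self, series: List[Series]) -> None:
        self._series = series

    def series(self, min_time: int, max_time: int,
               matchers: Dict[str, str]) -> Iterator[Series]:
        # Yield results one at a time, mirroring the streamed SeriesResponse.
        for s in self._series:
            if any(s.labels.get(k) != v for k, v in matchers.items()):
                continue
            window = [(t, v) for t, v in s.samples if min_time <= t <= max_time]
            if window:
                yield Series(s.labels, window)

    def label_names(self) -> List[str]:
        return sorted({name for s in self._series for name in s.labels})

    def label_values(self, name: str) -> List[str]:
        return sorted({s.labels[name] for s in self._series if name in s.labels})


store = InMemoryStore([
    Series({"__name__": "up", "job": "api"}, [(1000, 1.0), (2000, 1.0), (3000, 0.0)]),
    Series({"__name__": "up", "job": "db"}, [(1000, 1.0)]),
])
result = list(store.series(1500, 3000, {"job": "api"}))
```

Because every component exposes the same narrow interface, a caller does not need to know whether the data behind it comes from a live Prometheus or from somewhere else.
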
  24. @bwplotka SSD Querier Prometheus Sidecar Querier Store API Targets HTTP

    Query API
  25. @bwplotka SSD Global View Prometheus Sidecar Querier Targets SSD Sidecar

    Targets Prometheus Merge Store API
  26. @bwplotka SSD Global View + Availability Prometheus Sidecar Targets SSD

    Sidecar Targets Prometheus SSD Sidecar Prometheus “replica”:”A” “replica”:”B” Querier Merge Deduplicate Store API
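
The merge and deduplicate step can be illustrated with a small sketch. It assumes that HA replicas differ only in a `replica` label and that, per timestamp, the first value seen wins. This is a deliberate simplification for illustration and is not presented as the algorithm Thanos uses; real replicas also scrape at slightly different timestamps, which this sketch ignores.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]
Sample = Tuple[int, float]  # (timestamp, value)


def deduplicate(series: List[Tuple[Labels, List[Sample]]],
                replica_label: str = "replica") -> Dict[Tuple, List[Sample]]:
    """Merge series that differ only in the replica label.

    Simplification: for each timestamp the first value seen wins.
    """
    merged: Dict[Tuple, Dict[int, float]] = {}
    for labels, samples in series:
        # Identity of a series once the replica label is dropped.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        bucket = merged.setdefault(key, {})
        for ts, value in samples:
            bucket.setdefault(ts, value)
    return {key: sorted(b.items()) for key, b in merged.items()}


replica_a = ({"__name__": "up", "job": "api", "replica": "A"}, [(0, 1.0), (15, 1.0)])
# Replica B missed the scrape at t=15 but has t=30, which A lacks.
replica_b = ({"__name__": "up", "job": "api", "replica": "B"}, [(0, 1.0), (30, 1.0)])

out = deduplicate([replica_a, replica_b])
```

The result is a single series without the `replica` label whose gaps in one replica are filled from the other.
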
  27. @bwplotka Thanos Goals • Have a global view ✓ •

    Have a HA in place ✓ Prometheus Sidecar SSD Sidecar Prometheus Sidecar Prometheus Querier
  28. @bwplotka Global View + Availability

  29. @bwplotka Global View + Availability

  30. Historical Metrics What exactly happened X months ago?

  31. @bwplotka Prometheus Query Engine Scrape Engine Compactor Rule & Alert

    Engine Prometheus TSDB
  32. @bwplotka TSDB Layout Block 4 Block 3 Block 1 chunks

    chunks chunks chunks index T-10h T-16h T-4h T-2h T
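
Since each block covers a fixed time range, a query only needs to open the blocks that overlap its own range. A minimal sketch of that selection follows; the block names and boundaries are hypothetical, loosely following the slide's time axis, and the half-open range convention is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    name: str
    min_time: int  # inclusive, in hours relative to now (negative = past)
    max_time: int  # exclusive


def blocks_for_query(blocks: List[Block], start: int, end: int) -> List[str]:
    """Return the names of blocks whose range overlaps [start, end]."""
    return [b.name for b in blocks if b.min_time <= end and start < b.max_time]


# Hypothetical layout, for illustration only.
layout = [
    Block("block-1", -16, -10),
    Block("block-2", -10, -4),
    Block("block-3", -4, -2),
    Block("block-4", -2, 0),
]

print(blocks_for_query(layout, start=-5, end=-3))  # ['block-2', 'block-3']
```
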
  33. @bwplotka SSD Data saving Prometheus Sidecar Targets Object Storage Blocks

    Blocks Block --gcs.bucket=...
  34. @bwplotka SSD Data saving Prometheus Sidecar Targets Object Storage Blocks

    Blocks Block --storage.tsdb.max-block-duration=2h --storage.tsdb.retention=12h
  35. @bwplotka Store Gateway Object Storage Blocks Cache Store Gateway Querier

    Store API
  36. @bwplotka Object Storage Blocks Ruler Ruler Querier Store API Query

    SSD Blocks
  37. Compaction Density matters

  38. @bwplotka Compaction Object Storage Blocks Disk Compactor Block Blocks Block

  39. Downsampling

  40. @bwplotka Downsampling Raw: 16 bytes/sample Compressed: 1.07 bytes/sample

  41. @bwplotka Downsampling Decompressing one sample takes 10-40 nanoseconds.

  42. @bwplotka Downsampling Decompressing one sample takes 10-40 nanoseconds. Decompressing 1000 series @ 30s scrape interval for 1 year of data takes 10-40 seconds alone. Plus your actual computation over all those samples, e.g. rate()
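
The 10-40 second estimate follows directly from the per-sample cost. A short worked calculation, assuming a 365-day year and one sample per series per scrape interval (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def decode_seconds(series: int, scrape_interval_s: int, ns_per_sample: float) -> float:
    """Time spent purely on decompressing one year of samples."""
    samples = series * (SECONDS_PER_YEAR // scrape_interval_s)
    return samples * ns_per_sample / 1e9


low = decode_seconds(1000, 30, 10)
high = decode_seconds(1000, 30, 40)
print(f"{low:.1f}s to {high:.1f}s")  # 10.5s to 42.0s
```

That is roughly one billion samples, which is why reducing the number of samples per series matters for long-range queries.
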
  43. @bwplotka Downsampling Block RAW Block @ 5m Block @ 1h

    10x 12x
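
The 10x and 12x factors match a 30s scrape interval: ten raw samples fit into a 5-minute window, and twelve 5-minute windows fit into an hour. The sketch below downsamples raw samples into fixed windows, assuming each window keeps count, sum, min and max; which aggregates are actually stored is not stated on the slide.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


def downsample(samples: List[Sample], window_s: int) -> List[Dict[str, float]]:
    """Aggregate raw samples into fixed windows aligned to window_s."""
    windows: Dict[int, List[float]] = {}
    for ts, value in samples:
        windows.setdefault(ts - ts % window_s, []).append(value)
    return [
        {"start": start, "count": len(vals), "sum": sum(vals),
         "min": min(vals), "max": max(vals)}
        for start, vals in sorted(windows.items())
    ]


# One hour of a gauge scraped every 30 seconds: 120 raw samples.
raw = [(t, float(t % 7)) for t in range(0, 3600, 30)]
five_min = downsample(raw, 300)
print(len(raw), len(five_min))  # 120 12
```

Keeping several aggregates per window, rather than a single average, lets later queries still compute minimums, maximums and averages over the downsampled data.
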
  44. @bwplotka Thanos Goals • Have a global view ✓ •

    Have a HA in place ✓ • Increase retention ✓
  45. @bwplotka Prometheus Query Engine Scrape Engine Compactor Rule & Alert

    Engine Prometheus
  46. @bwplotka Thanos Scrape Engine Compactor Rule & Alert Engine Thanos

    Querier Thanos Querier Thanos Querier
  47. @bwplotka Thanos Compactor Rule & Alert Engine Thanos Querier Thanos

    Querier Thanos Querier SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar
  48. @bwplotka Thanos Compactor Thanos Querier Thanos Querier Thanos Querier SSD

    Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Thanos Ruler Thanos Ruler
  49. @bwplotka Thanos Thanos Ruler Thanos Ruler Thanos Querier Thanos Querier

    Thanos Querier SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Global Compactor
  50. @bwplotka Thanos Store Gateway Store Gateway Object Storage SSD Prometheus

    Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Thanos Ruler Thanos Ruler Global Compactor Thanos Querier Thanos Querier Thanos Querier
  51. @bwplotka Example Deployment Cluster 1 Cluster 2 + Cluster n

    Cluster n+1 + ... + Monitoring Cluster Grafana Alertmanager Bucket Compactor Querier Querier Querier Ruler Store Statically configured / global discovery + + +
  52. Open Source & Thanos

  53. Open Source: It is worth it!

  54. Open source from the very beginning

  55. Be flexible while staying focused

  56. Stay focused: Injection API Source: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<file_sd_config>
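
The `file_sd_config` mechanism referenced here is an example of a narrow injection API: any external tool can write a JSON file of target groups, and the consumer only has to watch that file. A minimal sketch of such a writer follows; the addresses, label and file name are hypothetical, and the atomic replace is a design choice of this sketch rather than something the slide prescribes.

```python
import json
import os
import tempfile
from typing import Dict, List


def write_file_sd(path: str, targets: List[str], labels: Dict[str, str]) -> None:
    """Write one target group in the file_sd JSON format, replacing path atomically."""
    groups = [{"targets": targets, "labels": labels}]
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so readers never see a half-written file.
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as tmp:
        json.dump(groups, tmp, indent=2)
    os.replace(tmp.name, path)


# Hypothetical addresses and file name, for illustration only.
out_path = os.path.join(tempfile.gettempdir(), "stores.json")
write_file_sd(out_path, ["10.0.0.1:10901", "10.0.0.2:10901"], {"cluster": "eu1"})
```
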

  57. Less magic == better

  58. @bwplotka Avoid magic: Gossip example • Easy to misconfigure • Hard to debug • Difficult cross-cluster setup • Over-complicated for Thanos needs
  59. @bwplotka Thanos Goals • Have a global view ✓ •

    Have a HA in place ✓ • Increase retention ✓ • Join effort with community ✓
  60. @bwplotka Summary • Start small with just a single Prometheus. • Extend your setup gradually. ◦ Just global querier. ◦ Federated global querier. ◦ Object storage. • Model your Thanos deployment. Generally: Share early, keep the project focused and simple.
  61. github.com/improbable-eng/thanos Bartek Płotka @bwplotka improbable.io ...psst, join our Slack workspace for more info! Credits: • Percona blog post about Prometheus 2 perf • Emojis designed by Freepik from Flaticon