Thanos - Prometheus at Scale (OS Summit EU)

Bartek Płotka Prometheus at Scale Edinburgh, 22nd October 2018 github.com/improbable-eng/thanos
@bwplotka

Bartek Płotka Software Engineer [email protected] @bwplotka

Founded: 2012 Employees: 310 Games in Development: +19
"Improbable’s platform, SpatialOS, is designed to let anyone build massive simulations, running in the cloud: imagine Minecraft with thousands of players in the same space or researchers creating simulated cities to model the behaviour of millions. Its ultimate goal: to create totally immersive, persistent virtual worlds." - WIRED, May 2017

Agenda
• Prometheus
• Thanos
• Thanos + Open Source

Prometheus https://github.com/prometheus/prometheus

Prometheus
[Diagram: Prometheus consists of a Scrape Engine, Query Engine, Rule & Alert Engine, Compactor, and local storage. It scrapes /metrics from Service X and other services every 15s, serves Grafana through the HTTP Query API, and sends alerts to Alertmanager.]

Single Prometheus
[Diagram: one cluster with the workloads you want to monitor, a single Prometheus, Grafana, and Alertmanager.]
Ingesting 200 000 samples/sec (3 mln series with a 15s scrape interval) takes 1 CPU and 13 GB RAM.
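The two figures on this slide are the same load expressed two ways: every active series yields one sample per scrape. A minimal sketch of that arithmetic follows; the function name is only illustrative.

```python
def samples_per_second(active_series: int, scrape_interval_s: float) -> float:
    """Ingestion rate when every active series is scraped once per interval."""
    return active_series / scrape_interval_s


# Numbers from the slide: 3 mln series, 15s scrape interval.
rate = samples_per_second(3_000_000, 15)
print(f"{rate:,.0f} samples/sec")
```

Doubling the scrape interval to 30s would halve the ingestion rate for the same number of series.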

High Availability
[Diagram: the same cluster, now with replicated Prometheus servers scraping the workloads you want to monitor, plus Grafana and Alertmanager.]

Globally distributed clusters
[Diagram: Cluster 1, Cluster 2, ..., Cluster n, Cluster n+1, each running its own Prometheus.]

Problem: Global View
[Diagram: two Prometheus servers, source = "a" and source = "b".] Which one answers sum(rate(go_memstats_alloc_bytes_total[1m]))?

Problem: Global View + HA
[Diagram: two Prometheus replicas expose go_memstats_alloc_bytes_total{..., replica="A"} and go_memstats_alloc_bytes_total{..., replica="B"}; a query should see a single go_memstats_alloc_bytes_total{...} [deduplicated].]

Problem: Metric retention
[Diagram: Prometheus servers with local SSD; remote write.]

Thanos Goals
• Have a global view
• Have HA in place
• Increase retention

Global View See everything from a single place!

Sidecar
[Diagram: each Prometheus (local SSD, scraping its targets) gets a Sidecar next to it; the Sidecar exposes the Store API over gRPC.]

Store API
service Store {
  rpc Series(SeriesRequest) returns (stream SeriesResponse);
  rpc LabelNames(LabelNamesRequest) returns (LabelNamesResponse);
  rpc LabelValues(LabelValuesRequest) returns (LabelValuesResponse);
}
message SeriesRequest {
  int64 min_time = 1;
  int64 max_time = 2;
  repeated LabelMatcher matchers = 3;
}
[Diagram: the Sidecar serves the Store API on top of Prometheus remote read.]
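To make the shape of the Store API concrete, here is a toy in-memory store that mirrors the three RPCs on the slide. It is a sketch under assumptions, not the Thanos implementation: the class and method names are hypothetical, matchers only support equality, and the time range is treated as inclusive on both ends.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelMatcher:
    name: str
    value: str  # equality only in this sketch; real matchers support more types


@dataclass
class Series:
    labels: dict
    samples: list  # list of (timestamp_ms, value) pairs


class InMemoryStore:
    """Hypothetical stand-in for anything that serves the Store API."""

    def __init__(self, series):
        self._series = list(series)

    def series(self, min_time, max_time, matchers):
        """Yield (stream) every matching series, clipped to [min_time, max_time]."""
        for s in self._series:
            if not all(s.labels.get(m.name) == m.value for m in matchers):
                continue
            window = [(t, v) for t, v in s.samples if min_time <= t <= max_time]
            if window:
                yield Series(dict(s.labels), window)

    def label_names(self):
        return sorted({name for s in self._series for name in s.labels})

    def label_values(self, name):
        return sorted({s.labels[name] for s in self._series if name in s.labels})


store = InMemoryStore([
    Series({"__name__": "up", "job": "api"}, [(1000, 1.0), (2000, 1.0), (3000, 0.0)]),
    Series({"__name__": "up", "job": "db"}, [(1000, 1.0)]),
])
for s in store.series(1500, 3000, [LabelMatcher("job", "api")]):
    print(s.labels, s.samples)
```

Because every component answers the same three calls, a caller does not need to know whether the data comes from a live Prometheus or from somewhere else.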

Querier
[Diagram: the Querier reads from the Sidecar over the Store API and exposes the HTTP Query API.]

Global View
[Diagram: one Querier merges results from several Prometheus + Sidecar pairs over the Store API.]

Global View + Availability
[Diagram: Prometheus replicas are labeled "replica":"A" and "replica":"B"; the Querier merges and deduplicates their series over the Store API.]
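The merge and deduplicate steps can be sketched as follows. This is a deliberately naive illustration, not the algorithm Thanos uses: it assumes both replicas report samples at identical timestamps and simply keeps the first value seen per timestamp, so a gap in one replica is filled from the other. The function name and the `replica` default are assumptions taken from the slide's label.

```python
def deduplicate(series_list, replica_label="replica"):
    """Merge series that differ only in the replica label.

    series_list: iterable of (labels_dict, [(timestamp, value), ...]).
    Returns {label_pairs_without_replica: sorted [(timestamp, value), ...]}.
    """
    merged = {}
    for labels, samples in series_list:
        # Series identity is the label set with the replica label removed.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        by_ts = merged.setdefault(key, {})
        for ts, value in samples:
            by_ts.setdefault(ts, value)  # first replica to report a timestamp wins
    return {key: sorted(by_ts.items()) for key, by_ts in merged.items()}


replica_a = ({"__name__": "go_memstats_alloc_bytes_total", "replica": "A"},
             [(0, 1.0), (15, 2.0), (45, 4.0)])           # missed the scrape at t=30
replica_b = ({"__name__": "go_memstats_alloc_bytes_total", "replica": "B"},
             [(0, 1.0), (15, 2.0), (30, 3.0), (45, 4.0)])
for key, samples in deduplicate([replica_a, replica_b]).items():
    print(dict(key), samples)
```

In practice replicas scrape at slightly different times, so a real implementation has to choose between replicas rather than align timestamps exactly.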

Thanos Goals
• Have a global view ✓
• Have HA in place ✓
[Diagram: a Querier in front of several Prometheus + Sidecar pairs.]

Global View + Availability

Historical Metrics What exactly happened X months ago?

Prometheus TSDB

TSDB Layout
[Diagram: a timeline with boundaries T-16h, T-10h, T-4h, T-2h, and T, split into blocks (Block 1, ..., Block 3, Block 4); each block holds chunks and an index.]
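Because each block covers a fixed time range, a query only needs the blocks that overlap its range. The sketch below illustrates that selection. The mapping of the slide's boundaries onto four consecutive blocks is an assumption (the transcript lost part of the diagram), and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    min_time_h: int  # inclusive, in hours relative to now (T = 0)
    max_time_h: int  # exclusive


# Assumed layout built from the boundaries on the slide.
BLOCKS = [
    Block("block-1", -16, -10),
    Block("block-2", -10, -4),
    Block("block-3", -4, -2),
    Block("block-4", -2, 0),
]


def blocks_for_range(blocks, start_h, end_h):
    """Return the blocks whose time range overlaps [start_h, end_h)."""
    return [b for b in blocks if b.min_time_h < end_h and start_h < b.max_time_h]


# A query over the last 3 hours touches only the two newest blocks.
print([b.name for b in blocks_for_range(BLOCKS, -3, 0)])
```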

Data saving
[Diagram: the Sidecar uploads the blocks Prometheus writes to local SSD into object storage.]
Sidecar: --gcs.bucket=...
Prometheus: --storage.tsdb.max-block-duration=2h --storage.tsdb.retention=12h

Store Gateway
[Diagram: the Store Gateway reads blocks from object storage, keeps a cache, and serves the Querier over the Store API.]

Ruler
[Diagram: the Ruler queries the Querier, keeps its blocks on local SSD, uploads blocks to object storage, and is itself reachable over the Store API.]

Compaction Density matters

Compaction
[Diagram: the Compactor downloads blocks from object storage to local disk and merges them into a single block.]

Downsampling

Downsampling
Raw: 16 bytes/sample. Compressed: 1.07 bytes/sample.
Decompressing one sample takes 10-40 nanoseconds. Decompressing 1000 series @ 30s scrape interval for 1 year of data takes 10-40 seconds alone, plus your actual computation over all those samples, e.g. rate().
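The 10-40 second figure follows directly from the per-sample cost. A short back-of-the-envelope check, assuming a 365-day year and hypothetical helper names:

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def decompression_seconds(series, scrape_interval_s, ns_per_sample,
                          span_s=SECONDS_PER_YEAR):
    """Time spent only on decompressing every sample in the queried span."""
    samples = series * (span_s // scrape_interval_s)
    return samples * ns_per_sample / 1e9


low = decompression_seconds(1000, 30, 10)    # 10 ns per sample
high = decompression_seconds(1000, 30, 40)   # 40 ns per sample
print(f"{low:.1f}s to {high:.1f}s")

# Compression ratio implied by the slide's bytes-per-sample figures.
print(f"{16 / 1.07:.1f}x smaller than raw")
```

That is about 1.05 billion samples, which works out to roughly 10.5 to 42 seconds before any query evaluation happens.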

Downsampling
Block RAW -> (10x) -> Block @ 5m -> (12x) -> Block @ 1h
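The idea behind the reduction can be sketched as replacing all raw samples in a fixed window with a few aggregates. This is an illustrative sketch only: the choice of count, sum, min, and max, the window alignment to multiples of the resolution, and the function name are assumptions, and counters need extra care that is left out here.

```python
def downsample(samples, resolution_ms):
    """Aggregate (timestamp_ms, value) samples into fixed windows.

    Returns a list of (window_start_ms, aggregates) sorted by time, where
    aggregates holds count, sum, min and max of the samples in the window.
    """
    windows = {}
    for ts, value in samples:
        start = ts - ts % resolution_ms  # align to the window boundary
        agg = windows.get(start)
        if agg is None:
            windows[start] = {"count": 1, "sum": value, "min": value, "max": value}
        else:
            agg["count"] += 1
            agg["sum"] += value
            agg["min"] = min(agg["min"], value)
            agg["max"] = max(agg["max"], value)
    return [(start, windows[start]) for start in sorted(windows)]


# 10 minutes of samples at a 30s scrape interval -> two 5m windows.
raw = [(i * 30_000, i) for i in range(20)]
five_min = downsample(raw, 5 * 60_000)
print(len(raw), "raw samples ->", len(five_min), "windows")
```

With a 30s scrape interval each 5m window covers 10 raw samples, which matches the 10x step on the slide; twelve 5m windows per hour give the 12x step.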

Thanos Goals
• Have a global view ✓
• Have HA in place ✓
• Increase retention ✓

Thanos
[Diagram, built up step by step from the Prometheus architecture: the Query Engine becomes a set of Thanos Queriers; the Scrape Engine becomes several Prometheus servers, each with local SSD and a Sidecar; the Rule & Alert Engine becomes Thanos Rulers; the Compactor becomes a Global Compactor; finally Store Gateways in front of Object Storage are added.]

Example Deployment
[Diagram: Cluster 1, Cluster 2, ..., Cluster n, Cluster n+1 connect to a Monitoring Cluster that runs Grafana, Alertmanager, Queriers, Ruler, Store, Compactor, and the Bucket. Connections are statically configured or use global discovery.]

Open Source & Thanos

Open Source: It is worth it!

Open source from the very beginning

Be flexible while staying focused

Stay focused: Injection API Source: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#<file_sd_config>
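The linked file_sd_config format is a small, generic way to inject targets: any external tool can write a JSON file listing addresses and labels, and the consumer re-reads it. A minimal sketch of producing such a file's content follows; the addresses, label, and function name are hypothetical.

```python
import json


def render_file_sd(targets, labels=None):
    """Render one target group in the Prometheus file_sd JSON format."""
    group = {"targets": list(targets)}
    if labels:
        group["labels"] = dict(labels)
    return json.dumps([group], indent=2)


# Hypothetical addresses; any discovery mechanism could generate this list.
content = render_file_sd(
    ["sidecar-0.example:10901", "sidecar-1.example:10901"],
    labels={"cluster": "eu1"},
)
print(content)
```

Keeping discovery behind a file like this means the project itself does not need to support every service discovery system.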

Less magic == better

Avoid magic: Gossip example
• Easy to misconfigure
• Hard to debug
• Difficult cross cluster setup
• Over-complicated for Thanos needs

Thanos Goals
• Have a global view ✓
• Have HA in place ✓
• Increase retention ✓
• Join effort with community ✓

Summary
• Start small with just a single Prometheus.
• Extend your setup gradually.
  ◦ Just global querier.
  ◦ Federated global querier.
  ◦ Object storage.
• Model your Thanos deployment.
Generally: Share early, keep the project focused and simple.

github.com/improbable-eng/thanos
Bartek Płotka @bwplotka improbable.io
...psst, join our slack workspace for more info!
Credits:
• Percona blog post about Prometheus 2 perf
• Emojis designed by Freepik from Flaticon
