Thanos - Prometheus at Scale

Slide 1

Slide 1 text

Prometheus at Scale
Bartek Płotka
London, 26th September 2018
github.com/improbable-eng/thanos
@bwplotka

Slide 2

Slide 2 text

Bartek Płotka Software Engineer [email protected] @bwplotka

Slide 3

Slide 3 text

Founded: 2012
Employees: 310
Games in Development: +19

"Improbable’s platform, SpatialOS, is designed to let anyone build massive simulations, running in the cloud: imagine Minecraft with thousands of players in the same space or researchers creating simulated cities to model the behaviour of millions. Its ultimate goal: to create totally immersive, persistent virtual worlds." - WIRED, May 2017

Slide 4

Slide 4 text

Agenda Prometheus

Slide 5

Slide 5 text

Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus

Slide 6

Slide 6 text

Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos

Slide 7

Slide 7 text

Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos + + + + +

Slide 8

Slide 8 text

Agenda Prometheus Prometheus Prometheus Prometheus Prometheus Prometheus Thanos + + + + + + + + +

Slide 9

Slide 9 text

Prometheus
Components: Scrape Engine, Query Engine, Rule & Alert Engine, Compactor, Local storage
Diagram labels: Services (Service X, Service X) scraped on /metrics every 15s; HTTP Query API; Grafana; Alertmanager

Slide 10

Slide 10 text

Cluster Single Prometheus Prometheus Grafana Alertmanager Workloads you want to monitor

Slide 11

Slide 11 text

Single Prometheus
Diagram labels: Cluster, Prometheus, Grafana, Alertmanager, Workloads you want to monitor
200,000 samples/sec = 1 CPU, 13 GB RAM = 3 million series with a 15s scrape interval.
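
The figures on this slide are tied together by simple arithmetic: each active series yields one sample per scrape, so the ingestion rate is the series count divided by the scrape interval. A minimal sketch, assuming every series is scraped at the same 15s interval (the function name is illustrative and not part of Prometheus):

```python
def samples_per_second(active_series: int, scrape_interval_s: float) -> float:
    """Ingestion rate when every series is scraped once per interval."""
    return active_series / scrape_interval_s


# 3 million series scraped every 15 seconds (numbers from the slide).
rate = samples_per_second(3_000_000, 15)
print(rate)  # 200000.0
```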

Slide 12

Slide 12 text

Cluster High Availability Prometheus Grafana Alertmanager Workloads you want to monitor Prometheus Prometheus

Slide 13

Slide 13 text

Cluster 1 Globally distributed clusters Cluster 2 Prometheus Cluster n Cluster n+1 Prometheus ...

Slide 14

Slide 14 text

Problem: Global View
Prometheus (source = "a"), Prometheus (source = "b")
sum(rate(go_memstats_alloc_bytes_total[1m])) ?

Slide 15

Slide 15 text

Problem: Global View
Prometheus (source = "a") and Prometheus (source = "b"), each exposing /federate, scraped every 15s by a third Prometheus
sum(go_memstats_alloc_bytes_total::rate1m) ✓/?

Slide 16

Slide 16 text

Problem: Global View + HA
Prometheus: go_memstats_alloc_bytes_total{..., replica="A"}
Prometheus: go_memstats_alloc_bytes_total{..., replica="B"}
Deduplicated: go_memstats_alloc_bytes_total{...}
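
The deduplication idea on this slide can be sketched as follows: drop the replica label, then merge the samples of series that become identical. This is a deliberately simplified illustration, not the algorithm Thanos uses. It assumes both replicas report samples at exactly the same timestamps, which real HA pairs generally do not, and every name in it is hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Labels = FrozenSet[Tuple[str, str]]
Sample = Tuple[int, float]  # (timestamp in ms, value)


def deduplicate(
    series: Dict[Labels, List[Sample]], replica_label: str = "replica"
) -> Dict[Labels, List[Sample]]:
    """Merge series that differ only by the replica label."""
    merged: Dict[Labels, Dict[int, float]] = {}
    for labels, samples in series.items():
        # Series identity without the replica label.
        key = frozenset((k, v) for k, v in labels if k != replica_label)
        bucket = merged.setdefault(key, {})
        for ts, value in samples:
            # The first replica seen wins for a given timestamp.
            bucket.setdefault(ts, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


name = ("__name__", "go_memstats_alloc_bytes_total")
replica_a = frozenset({name, ("replica", "A")})
replica_b = frozenset({name, ("replica", "B")})

result = deduplicate(
    {
        replica_a: [(0, 1.0), (15_000, 2.0)],      # replica A missed the last scrape
        replica_b: [(15_000, 2.0), (30_000, 3.0)],  # replica B missed the first scrape
    }
)
print(result[frozenset({name})])
```

Together the two replicas cover all three scrapes, which is the availability benefit the slide is after.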

Slide 17

Slide 17 text

Problem: Metric retention

Slide 18

Slide 18 text

Problem: Metric retention SSD Prometheus Prometheus Remote write

Slide 19

Slide 19 text

Thanos Goals
● Have a global view
● Have HA in place
● Increase retention

Slide 20

Slide 20 text

Global View See everything from a single place!

Slide 21

Slide 21 text

SSD Prometheus Prometheus Targets

Slide 22

Slide 22 text

Sidecar
Diagram labels: SSD, Prometheus, Sidecar, Targets, gRPC (Store API)

spec:
  containers:
  - image: improbable/thanos:v0.1.0
    command:
    - thanos
    args:
    - sidecar
    - --http-address=0.0.0.0:19190
    - --grpc-address=0.0.0.0:19090
    - --tsdb.path=/prometheus-data
    - --prometheus.url=http://localhost:9090

Slide 23

Slide 23 text

Store API
Diagram labels: Prometheus, remote read, Sidecar, Store API

service Store {
  rpc Series(SeriesRequest) returns (stream SeriesResponse);
  rpc LabelNames(LabelNamesRequest) returns (LabelNamesResponse);
  rpc LabelValues(LabelValuesRequest) returns (LabelValuesResponse);
}

message SeriesRequest {
  int64 min_time = 1;
  int64 max_time = 2;
  repeated LabelMatcher matchers = 3;
}
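
To make the Series call more concrete, here is a minimal in-memory sketch of what a store answering a SeriesRequest has to do: keep the series whose labels satisfy every matcher, and return only the samples inside [min_time, max_time]. The class and field names are hypothetical, the matcher is reduced to equality and inequality, and treating both time bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class LabelMatcher:
    name: str
    value: str
    negate: bool = False  # False: label == value, True: label != value

    def matches(self, labels: Dict[str, str]) -> bool:
        equal = labels.get(self.name, "") == self.value
        return not equal if self.negate else equal


@dataclass
class Series:
    labels: Dict[str, str]
    samples: List[Tuple[int, float]]  # (timestamp in ms, value), sorted by time


class InMemoryStore:
    """Toy stand-in for anything that serves the Store API."""

    def __init__(self, series: List[Series]) -> None:
        self._series = series

    def series(
        self, min_time: int, max_time: int, matchers: List[LabelMatcher]
    ) -> Iterator[Series]:
        # Yield results one by one, mirroring the streamed response.
        for s in self._series:
            if not all(m.matches(s.labels) for m in matchers):
                continue
            window = [(t, v) for t, v in s.samples if min_time <= t <= max_time]
            if window:
                yield Series(s.labels, window)


store = InMemoryStore(
    [
        Series({"job": "api", "source": "a"}, [(0, 1.0), (10, 2.0), (20, 3.0)]),
        Series({"job": "db", "source": "a"}, [(10, 5.0)]),
    ]
)
for found in store.series(5, 20, [LabelMatcher("job", "api")]):
    print(found.labels, found.samples)
```

Because every component exposes the same three calls, a querier can treat sidecars and other stores uniformly and merge whatever they stream back.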

Slide 24

Slide 24 text

SSD Querier Prometheus Sidecar Querier Store API Targets HTTP Query API

Slide 25

Slide 25 text

SSD Global View Prometheus Sidecar Querier Targets SSD Sidecar Targets Prometheus Merge Store API

Slide 26

Slide 26 text

Global View + Availability
Diagram labels: three Prometheus + Sidecar pairs (each with SSD and Targets), two of them labelled "replica":"A" and "replica":"B"; Querier (Merge, Deduplicate) over the Store API

Slide 27

Slide 27 text

Thanos Goals
● Have a global view ✓
● Have HA in place ✓
Diagram labels: Prometheus + Sidecar (x3), SSD, Querier

spec:
  containers:
  - image: improbable/thanos:v0.1.0
    command:
    - thanos
    args:
    - query
    - --http-address=0.0.0.0:9090
    - --grpc-address=0.0.0.0:19090
    - --query.replica-label=replica
    - --store=sidecar1.default.svc:19090
    - --store=sidecar2-a.default.svc:19090
    - --store=sidecar2-b.default.svc:19090

Slide 28

Slide 28 text

Global View + Availability

Slide 29

Slide 29 text

Global View + Availability

Slide 30

Slide 30 text

Historical Metrics What exactly happened X months ago?

Slide 31

Slide 31 text

TSDB Layout Block 4 Block 3 Block 1 chunks chunks chunks chunks index T-10h T-16h T-4h T-2h T

Slide 32

Slide 32 text

SSD Data saving Prometheus Sidecar Targets Object Storage Blocks Blocks Block --gcs.bucket=...

Slide 33

Slide 33 text

Data saving
Diagram labels: SSD, Prometheus, Sidecar, Targets, Object Storage, Blocks
Prometheus flags:
  --storage.tsdb.max-block-duration=2h
  --storage.tsdb.retention=12h

Slide 34

Slide 34 text

Store Gateway
Diagram labels: Object Storage, Blocks, Cache, Store, Querier, Store API

spec:
  containers:
  - image: improbable/thanos:v0.1.0
    command:
    - thanos
    args:
    - store
    - --http-address=0.0.0.0:19190
    - --grpc-address=0.0.0.0:19090
    - --store-dir=/store-data
    - --gcs.bucket=...
    - --index-cache-size=1GB
    - --chunk-pool-size=8GB

Slide 35

Slide 35 text

Ruler
Diagram labels: Object Storage, Blocks, Ruler, Querier, Store API, Query, SSD

spec:
  containers:
  - image: improbable/thanos:v0.1.0
    command:
    - thanos
    args:
    - rule
    - --http-address=0.0.0.0:19190
    - --grpc-address=0.0.0.0:19090
    - --data-dir=/rule-data
    - --gcs.bucket=...
    - --rule-file=...
    - --query-url=...

Slide 36

Slide 36 text

Compaction Density matters

Slide 37

Slide 37 text

Compaction Object Storage Blocks Disk Compactor

Slide 38

Slide 38 text

Compaction
Diagram labels: Object Storage, Blocks, Block, Disk, Compactor

spec:
  containers:
  - image: improbable/thanos:v0.1.0
    command:
    - thanos
    args:
    - compact
    - --http-address=0.0.0.0:80
    - --data-dir=/compactor-data
    - --gcs.bucket=...
    - --wait

Slide 39

Slide 39 text

Downsampling

Slide 40

Slide 40 text

Downsampling
Raw: 16 bytes/sample
Compressed: 1.07 bytes/sample
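
To put the two per-sample sizes in perspective, the sketch below converts them into bytes written per day. The 200,000 samples/sec ingestion rate is borrowed from the earlier single-Prometheus slide as an assumption, and the estimate ignores the index and any other overhead.

```python
RAW_BYTES_PER_SAMPLE = 16
COMPRESSED_BYTES_PER_SAMPLE = 1.07
SECONDS_PER_DAY = 86_400


def bytes_per_day(samples_per_second: float, bytes_per_sample: float) -> float:
    """Sample payload written per day, ignoring index and metadata."""
    return samples_per_second * SECONDS_PER_DAY * bytes_per_sample


raw_gb = bytes_per_day(200_000, RAW_BYTES_PER_SAMPLE) / 1e9
compressed_gb = bytes_per_day(200_000, COMPRESSED_BYTES_PER_SAMPLE) / 1e9
ratio = RAW_BYTES_PER_SAMPLE / COMPRESSED_BYTES_PER_SAMPLE

print(f"raw: {raw_gb:.2f} GB/day, compressed: {compressed_gb:.2f} GB/day")
print(f"compression ratio: {ratio:.1f}x")
```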

Slide 41

Slide 41 text

Downsampling Decompressing one sample takes 10-40 nanoseconds.

Slide 42

Slide 42 text

Downsampling
Decompressing one sample takes 10-40 nanoseconds.
Decompressing 1000 series at a 30s scrape interval for 1 year of data takes 10-40 seconds alone, plus your actual computation over all those samples, e.g. rate().
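
The 10-40 second estimate follows directly from the per-sample cost. A quick check, assuming a 365-day year and no gaps in scraping (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def decompression_seconds(
    series: int,
    scrape_interval_s: int,
    ns_per_sample: float,
    period_s: int = SECONDS_PER_YEAR,
) -> float:
    """Time spent only on decompressing every sample in the queried range."""
    samples = series * (period_s // scrape_interval_s)
    return samples * ns_per_sample / 1e9


low = decompression_seconds(1000, 30, 10)
high = decompression_seconds(1000, 30, 40)
print(f"{low:.1f}s to {high:.1f}s")  # 10.5s to 42.0s
```

That is roughly 1.05 billion samples, which is why querying long ranges at raw resolution gets expensive before any PromQL function has run.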

Slide 43

Slide 43 text

Downsampling
Block RAW -> (10x) -> Block @ 5m -> (12x) -> Block @ 1h
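
The 10x and 12x factors match the window sizes if raw data arrives every 30s, as in the earlier example: 5m / 30s = 10 and 1h / 5m = 12. The sketch below shows the general idea of downsampling by folding raw samples into fixed windows. Keeping count, sum, min and max per window is an assumption made for illustration, and none of the names come from the Thanos code base.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in ms, value)


def downsample(
    samples: List[Sample], resolution_ms: int
) -> List[Tuple[int, Dict[str, float]]]:
    """Fold raw samples into fixed windows, keeping simple aggregates."""
    windows: Dict[int, Dict[str, float]] = {}
    for ts, value in samples:
        start = ts - ts % resolution_ms  # window the sample falls into
        agg = windows.get(start)
        if agg is None:
            windows[start] = {"count": 1, "sum": value, "min": value, "max": value}
        else:
            agg["count"] += 1
            agg["sum"] += value
            agg["min"] = min(agg["min"], value)
            agg["max"] = max(agg["max"], value)
    return [(start, windows[start]) for start in sorted(windows)]


# 10 minutes of raw data at a 30s interval: 20 samples.
raw = [(i * 30_000, float(i)) for i in range(20)]
five_minute = downsample(raw, 5 * 60 * 1000)
print(len(raw), "raw samples ->", len(five_minute), "windows")
for start, agg in five_minute:
    print(start, agg)
```

Keeping several aggregates per window, rather than a single averaged value, lets queries over downsampled data still answer min, max and average questions.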

Slide 44

Slide 44 text

Thanos Goals
● Have a global view ✓
● Have HA in place ✓
● Increase retention ✓

Slide 45

Slide 45 text

Prometheus Query Engine Scrape Engine Compactor Rule & Alert Engine Prometheus

Slide 46

Slide 46 text

Thanos Scrape Engine Compactor Rule & Alert Engine Thanos Querier Thanos Querier Thanos Querier

Slide 47

Slide 47 text

Thanos Compactor Rule & Alert Engine Thanos Querier Thanos Querier Thanos Querier SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar

Slide 48

Slide 48 text

Thanos Compactor Thanos Querier Thanos Querier Thanos Querier SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Thanos Ruler Thanos Ruler

Slide 49

Slide 49 text

Thanos Thanos Ruler Thanos Ruler Thanos Querier Thanos Querier Thanos Querier SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Global Compactor

Slide 50

Slide 50 text

Thanos Store Gateway Store Gateway Object Storage SSD Prometheus Sidecar SSD Prometheus Sidecar SSD Prometheus Sidecar Thanos Ruler Thanos Ruler Global Compactor Thanos Querier Thanos Querier Thanos Querier

Slide 51

Slide 51 text

Deployment Models

Slide 52

Slide 52 text

Just Global Querier Querier Querier Querier Cluster A (master) Cluster B Cluster C + + + + + +

Slide 53

Slide 53 text

Federated Querier Querier Querier Querier Querier Querier Querier … Querier Querier Querier Cluster A (master) Cluster B Cluster C + + Federation (through Store API) + + + +

Slide 54

Slide 54 text

Federated Deployment Querier Querier Querier Store Bucket Querier Querier Querier … Store Bucket Querier Querier Querier Store Bucket Cluster A (master) Cluster B Cluster C + + Federation (through Store API) + + + +

Slide 55

Slide 55 text

Hierarchical Deployment
Diagram labels: Cluster 1, Cluster 2, ..., Cluster n, Cluster n+1 (each marked +); Monitoring Cluster with Grafana, Alertmanager, Bucket, Compactor, Querier (x3), Ruler, Store
Statically configured / global discovery

Slide 56

Slide 56 text

Hierarchical Deployment ++++ ++ ++ ++++ ++ ++ ++++ ++ ++ Testing Staging Production Querier Querier Querier

Slide 57

Slide 57 text

Thanos Goals
● Have a global view ✓
● Have HA in place ✓
● Increase retention ✓

Slide 58

Slide 58 text

Next steps
● New object store providers!
● Memory optimizations
● Prometheus: Stream remote read
● Multi Tenancy extension
● Query safeguards

Slide 59

Slide 59 text

Summary
● Start small with just a single Prometheus.
● Extend your setup gradually.
  ○ Just global querier.
  ○ Federated global querier.
  ○ Object storage.
● Model your Thanos deployment.

Slide 60

Slide 60 text

github.com/improbable-eng/thanos
Bartek Płotka
@bwplotka
improbable.io
...psst, join our Slack workspace for more info!

Credits:
● Percona blog post about Prometheus 2 perf
● Emojis designed by Freepik from Flaticon