Storage in 2.0

Orchestration systems such as Kubernetes are rapidly gaining traction and bring the features of highly dynamic environments, such as frequent rolling updates and auto-scaling, to everyone. This causes the number of series to blow up, putting new strain on Prometheus' one-file-per-series data layout. This talk lays out the problems with the 1.0 storage and how the new storage engine, built from the ground up, solves them while also delivering a sharp increase in performance. Storage 2.0 also adds features such as granular, time-ranged deletes and live backups, whose implementation will be discussed.

Goutham Veeramachaneni

July 11, 2017

Transcript

  1. Storage in Prometheus 2.0
    Goutham Veeramachaneni
    Intern @ CoreOS, Berlin
    Student at IIT Hyderabad, India
    gouthamve, putadent
  2. How do you get these time-series?
    What you expose:
      requests_total{path="/status", method="GET"}
      requests_total{path="/status", method="POST"}
      requests_total{path="/", method="GET"}
    What Prometheus scrapes:
      requests_total{path="/status", method="GET", instance="10.0.0.1:80"}
      requests_total{path="/status", method="POST", instance="10.0.0.1:80"}
      requests_total{path="/", method="GET", instance="10.0.0.1:80"}
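The step from "what you expose" to "what Prometheus scrapes" can be pictured as merging each exposed label set with the labels of the scrape target. The following is a minimal sketch only: the function name `attach_target_labels` is hypothetical, and label-conflict handling is deliberately ignored.

```python
def attach_target_labels(exposed, target_labels):
    """Merge each exposed label set with the scrape target's labels.

    Assumption for this sketch: target labels simply win on conflict.
    Real conflict handling is not covered by the slide and is left out.
    """
    return [{**labels, **target_labels} for labels in exposed]


exposed = [
    {"__name__": "requests_total", "path": "/status", "method": "GET"},
    {"__name__": "requests_total", "path": "/status", "method": "POST"},
    {"__name__": "requests_total", "path": "/", "method": "GET"},
]

# The instance value is the one shown on the slide.
scraped = attach_target_labels(exposed, {"instance": "10.0.0.1:80"})

for series in scraped:
    print(series)
```

Every scraped series carries the extra `instance` label, so the same metric exposed by two instances becomes two distinct series.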
  3. Scale
    5 million active time series
    30 second scrape interval
    1 month of retention
    166,000 samples/second
    432 billion samples
    8 byte timestamp + 8 byte value ⇒ 7 TB on disk
    3,000 - 15,000 microservice instances
  4. Scale
    5 million active time series
    30 second scrape interval
    1 month of retention
    166,000 samples/second
    432 billion samples
    8 byte timestamp + 8 byte value ⇒ 600 GB on disk
    3,000 - 15,000 microservice instances
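The figures on the two "Scale" slides can be re-derived with a few lines of arithmetic. This sketch assumes "1 month" means 30 days and uses decimal units (1 TB = 10^12 bytes); the last value is only the bytes-per-sample ratio implied by the 600 GB figure, not a number stated on the slides.

```python
ACTIVE_SERIES = 5_000_000
SCRAPE_INTERVAL_S = 30
RETENTION_DAYS = 30              # assumption: "1 month" taken as 30 days
RAW_BYTES_PER_SAMPLE = 8 + 8     # 8 byte timestamp + 8 byte value

samples_per_second = ACTIVE_SERIES / SCRAPE_INTERVAL_S
total_samples = samples_per_second * RETENTION_DAYS * 24 * 3600
raw_bytes = total_samples * RAW_BYTES_PER_SAMPLE

# Ratio implied by the 600 GB slide (derived here, not stated in the talk).
implied_bytes_per_sample = 600e9 / total_samples

print(f"samples/second:       {samples_per_second:,.0f}")
print(f"total samples:        {total_samples / 1e9:,.0f} billion")
print(f"raw size on disk:     {raw_bytes / 1e12:.1f} TB")
print(f"implied bytes/sample: {implied_bytes_per_sample:.2f}")
```

The raw encoding lands at just under 7 TB, while 600 GB corresponds to well under 2 bytes per sample, which presumably reflects compression of the samples.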
  6. Scale
    5 million active time series
    150 million total time series
    30 second scrape interval
    1 month of retention
    166,000 samples/second
    432 billion samples
    8 byte timestamp + 8 byte value ⇒ 7 TB on disk
  7. 1.0 Querying
    1. Get series labels
    2. Calculate fingerprint
    3. Add the fingerprint against the label
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    Fingerprint: 3300, to identify each series.
  8. 1.0 Querying
    1. Get series labels
    2. Calculate fingerprint
    3. Add the fingerprint against the label-value pair
    {__name__="reqs", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    status="200": 1000 500000 99 1 1500 2 1001 5 1502
    method="GET": 2 999999 4 3 1502 9 6 5 10 ...
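The three 1.0 steps (get labels, calculate a fingerprint, record the fingerprint under each label-value pair) could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the 64-bit FNV-1a hash, the `0xff` separator byte, and the names `fingerprint` and `FingerprintIndex` are not taken from the slides.

```python
from collections import defaultdict

FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fingerprint(labels):
    """64-bit FNV-1a over the sorted label pairs (illustrative hash choice)."""
    h = FNV_OFFSET
    for name, value in sorted(labels.items()):
        data = name.encode("utf-8") + b"\xff" + value.encode("utf-8") + b"\xff"
        for byte in data:
            h ^= byte
            h = (h * FNV_PRIME) & MASK_64
    return h


class FingerprintIndex:
    """Hypothetical 1.0-style index: label pair -> fingerprints in arrival order."""

    def __init__(self):
        self.series = {}                    # fingerprint -> labels
        self.postings = defaultdict(list)   # (name, value) -> [fingerprint, ...]

    def add(self, labels):
        fp = fingerprint(labels)
        if fp not in self.series:
            self.series[fp] = dict(labels)
            for pair in labels.items():
                # Hash values arrive in no particular order, so the list is unsorted.
                self.postings[pair].append(fp)
        return fp

    def select(self, **matchers):
        """Fingerprints of series matching every given label=value pair."""
        sets = [set(self.postings.get(pair, [])) for pair in matchers.items()]
        return set.intersection(*sets) if sets else set(self.series)


index = FingerprintIndex()
fp_get = index.add({"__name__": "reqs", "status": "200", "method": "GET"})
fp_post = index.add({"__name__": "reqs", "status": "200", "method": "POST"})
matches = index.select(status="200", method="GET")
print(matches == {fp_get})
```

Because the per-pair lists are unsorted, this sketch falls back to building sets for every query, which hints at why unordered fingerprint lists become expensive as the number of series grows.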
  9. 2.0 Index
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
  10. 2.0 Index
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    status="200": 1 2 5 99 1000 1001 1500 1502 500000
    method="GET": 2 3 4 5 6 9 10 1502 999999 ...
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
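One simple way to get sorted lists from label pair to IDs is to hand out increasing IDs within a block, so that appending a new ID never breaks the ordering. The sketch below assumes exactly that; the class name `BlockIndex` and the sequential ID scheme are illustrative assumptions, not a description of the actual 2.0 index.

```python
from collections import defaultdict


class BlockIndex:
    """Hypothetical block-scoped index with sequential series IDs."""

    def __init__(self):
        self.next_id = 1
        self.ids = {}                       # sorted label pairs -> series ID
        self.postings = defaultdict(list)   # (name, value) -> ascending IDs

    def add(self, labels):
        key = tuple(sorted(labels.items()))
        if key in self.ids:
            return self.ids[key]
        series_id = self.next_id
        self.next_id += 1
        self.ids[key] = series_id
        for pair in key:
            # IDs only grow, so appending keeps every list sorted.
            self.postings[pair].append(series_id)
        return series_id


block = BlockIndex()
a = block.add({"path": "/", "status": "200", "method": "GET"})
b = block.add({"path": "/", "status": "200", "method": "POST"})
c = block.add({"path": "/", "status": "500", "method": "GET"})

print(block.postings[("status", "200")])
print(block.postings[("method", "GET")])
```

Since IDs are scoped to one block, they can stay small and dense, and each list is ready for merge-style set operations without any sorting at query time.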
  11. 2.0 Index
    status="200": 1 2 5 99 1000 1001 1500 1502 500000
    method="GET": 2 3 4 5 6 9 10 1502 999999 ...
    Intersect:
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
  12. 2.0 Index
    status="200": 1 2 5 99 1000 1001 1500 1502 500000
    method="GET": 2 3 4 5 6 9 10 1502 999999 ...
    Intersect: 2
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
  16. 2.0 Index
    status="200": 1 2 5 99 1000 1001 1500 1502 500000
    method="GET": 2 3 4 5 6 9 10 1502 999999 ...
    Intersect: 2 5
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
  19. 2.0 Index
    status="200": 1 2 5 99 1000 1001 1500 1502 500000
    method="GET": 2 3 4 5 6 9 10 1502 999999 ...
    Intersect: 2 5 1502
    {__name__="requests_total", pod="nginx-34534242-abc723", job="nginx", path="/api/v1/status", status="200", method="GET"}
    • Assign block-scoped ID to each series
    • Maintain sorted lists from label pair to IDs
    • Efficient k-way set operations
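The intersection built up step by step on the slides can be reproduced with a cursor-based merge over the sorted lists. This is a sketch, assuming each list is strictly ascending with no duplicates; the function name `intersect_sorted` is made up for the example.

```python
def intersect_sorted(*lists):
    """Intersect any number of ascending ID lists by advancing one cursor per list."""
    if not lists:
        return []
    cursors = [0] * len(lists)
    result = []
    while all(c < len(lst) for c, lst in zip(cursors, lists)):
        heads = [lst[c] for c, lst in zip(cursors, lists)]
        target = max(heads)
        if all(h == target for h in heads):
            # Every list currently points at the same ID: it is in the intersection.
            result.append(target)
            cursors = [c + 1 for c in cursors]
        else:
            # Advance only the cursors that are behind the largest head.
            cursors = [c + 1 if h < target else c for c, h in zip(cursors, heads)]
    return result


# Lists as shown on the slides (the trailing "..." is omitted).
status_200 = [1, 2, 5, 99, 1000, 1001, 1500, 1502, 500000]
method_get = [2, 3, 4, 5, 6, 9, 10, 1502, 999999]

print(intersect_sorted(status_200, method_get))  # prints [2, 5, 1502]
```

Each cursor only moves forward, so the work is bounded by the combined length of the lists, and the same loop handles more than two label pairs without change.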
  20. Benchmarks
    Kubernetes cluster + dedicated Prometheus nodes
    800 microservice instances + Kubernetes components
    120,000 samples/second
    300,000 active time series
    Swap out 50% of pods every 10 minutes
  21. Tombstones
    Series-Ref ---> Deleted Ranges
    {
      190: [{100, 200}, {300, 600}],
      250: [{100, 5000}],
    }
    When the querier runs, it picks up these ranges. If something is deleted, we skip that range in the query.
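The tombstone lookup described above can be sketched as a filter applied while reading samples. The series references and ranges are the ones from the slide; treating both range ends as inclusive timestamps, and the function names, are assumptions of this sketch.

```python
# Series reference -> list of deleted (start, end) time ranges, as on the slide.
tombstones = {
    190: [(100, 200), (300, 600)],
    250: [(100, 5000)],
}


def is_deleted(timestamp, ranges):
    """True if the timestamp falls inside any deleted range (ends assumed inclusive)."""
    return any(start <= timestamp <= end for start, end in ranges)


def query(samples, series_ref, tombstones):
    """Return the (timestamp, value) samples of one series, skipping deleted ranges."""
    ranges = tombstones.get(series_ref, [])
    return [(ts, v) for ts, v in samples if not is_deleted(ts, ranges)]


samples_190 = [(50, 1.0), (150, 2.0), (250, 3.0), (450, 4.0), (700, 5.0)]
print(query(samples_190, 190, tombstones))  # prints [(50, 1.0), (250, 3.0), (700, 5.0)]
```

The stored samples are never touched at delete time in this sketch; only the small tombstone map changes, and the data is hidden at query time.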
  22. Snapshots
    BACKUPS ARE HEEEERE!
    Step 1: Stop mutation and "save" the block (hard-link)
    Step 2: Persist in-memory blocks
    Step 3: Restart mutation
    Step 4: Back up the data lazily and safely
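The hard-link step is what makes the snapshot cheap: a hard link is a second name for the same file data, so nothing is copied and the snapshot survives later deletion of the original name. The sketch below shows only that step, under assumptions: the directory layout (`block/chunks/000001`) and the function names are hypothetical, and both directories must live on the same filesystem for `os.link` to work.

```python
import os
import tempfile


def snapshot_block(block_dir, snapshot_dir):
    """Hard-link every file under block_dir into snapshot_dir (same filesystem required)."""
    for root, _dirs, files in os.walk(block_dir):
        rel = os.path.relpath(root, block_dir)
        target_root = os.path.join(snapshot_dir, rel)
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target_root, name))


def demo():
    """Create a toy block, snapshot it, then delete the original file."""
    with tempfile.TemporaryDirectory() as tmp:
        block = os.path.join(tmp, "block")          # hypothetical layout
        os.makedirs(os.path.join(block, "chunks"))
        original = os.path.join(block, "chunks", "000001")
        with open(original, "wb") as f:
            f.write(b"samples")

        snapshot = os.path.join(tmp, "snapshot")
        snapshot_block(block, snapshot)
        linked = os.path.join(snapshot, "chunks", "000001")

        same_file = os.path.samefile(original, linked)
        os.remove(original)                          # e.g. retention drops the block
        with open(linked, "rb") as f:
            content_after_delete = f.read()
        return same_file, content_after_delete


same_file, content_after_delete = demo()
print(same_file, content_after_delete)
```

Once the links exist, the snapshot directory can be copied elsewhere at leisure while the database resumes writing, which matches the "lazily and safely" wording of the last step.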