
Metric-Driven Decision Making with Custom Prometheus Exporter


Cloud Native Days Spring 2021 Online
https://event.cloudnativedays.jp/cndo2021/talks/681


Takeshi Kondo

March 02, 2021

Transcript

  1. Metric-Driven Decision Making with Custom Prometheus Exporter Takeshi Kondo / @chaspy 2021/03/12 Cloud Native Days Spring 2021 Online
  2. Who am I: Takeshi Kondo (chaspy, chaspy_), Lead Software Engineer, Site Reliability at Quipper
  3. Metric-Driven Decision Making: decision making based on metrics

  4. Sound familiar? • Lately, ◦◦ seems [frequent | rare | fast | slow]? • Maybe • Probably • Surely • But who knows
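  5. Everyday events at work • Deploys fail a lot, and we keep rerunning them • That server stops responding a lot, and we keep restarting it • Security violations keep happening; notifications arrive, but nobody looks at them • Tickets are piling up, and have been for a long time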
  6. Sound familiar? • Lately, ◦◦ seems [frequent | rare | fast | slow]? How much is it actually? How much is acceptable? At what point do we take action?
  7. Metric-Driven Decision Making: decision making based on metrics

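  8. Today's goal • For people who vaguely sense a problem and want to solve it quantitatively to: • learn the benefits of Metric-Driven problem solving • learn how to write their own Prometheus Exporter • pick up hints for solving the problems close to them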
  9. What I will talk about today • What a Prometheus Exporter is • What the Prometheus format / OpenMetrics is • How to write your own Prometheus Exporter • Case studies • Summary
  10. Prometheus Exporter • Prometheus: an open-source, general-purpose monitoring system • Prometheus Exporter: • One of the components of Prometheus • The piece that exposes metrics to the Prometheus server (the piece that collects metrics) • Many kinds exist, both official and third-party • EXPORTERS AND INTEGRATIONS: https://prometheus.io/docs/instrumenting/exporters/
  11. Prometheus Architecture https://prometheus.io/docs/introduction/overview/#architecture

  12. Prometheus Architecture https://prometheus.io/docs/introduction/overview/#architecture

  13. What is the Prometheus format / OpenMetrics? https://openmetrics.io

  14. OpenMetrics • Extended from the Prometheus exposition format 0.0.4 • Aims to standardize metrics • OpenMetrics specifies today's de-facto standard for transmitting cloud-native metrics at scale, with support for both text representation and Protocol Buffers and brings it into IETF. It supports both pull and push-based data collection. • Metrics: a snapshot of the current state of a set of data (unlike logs or events) • Metrics are exposed in OpenMetrics format in response to HTTP GET /metrics • And more
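To make the exposition format on the slide above concrete, here is a minimal Go sketch that renders one gauge sample as text. It uses only the standard library rather than an official client library; `formatSample`, `escapeLabelValue`, the metric name `demo_open_issues`, and the label values are hypothetical names chosen for illustration, not anything from the talk.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// escapeLabelValue escapes backslash, double quote, and newline,
// which the text exposition format requires inside label values.
func escapeLabelValue(v string) string {
	return strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`).Replace(v)
}

// formatSample renders one sample line: name{k="v",...} value.
// Label names are sorted so the output is deterministic.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf(`%s="%s"`, k, escapeLabelValue(labels[k])))
	}
	if len(pairs) == 0 {
		return fmt.Sprintf("%s %g", name, value)
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

func main() {
	// demo_open_issues is a hypothetical metric name used only for this sketch.
	fmt.Println("# HELP demo_open_issues Number of open issues.")
	fmt.Println("# TYPE demo_open_issues gauge")
	fmt.Println(formatSample("demo_open_issues",
		map[string]string{"repo": "example/repo", "label": "SRE"}, 3))
}
```

The `# HELP` and `# TYPE` comment lines describe the metric family, and each following line is one labelled sample. In practice a client library would generate this text for you.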
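  15. Writing your own Prometheus Exporter • Client libraries exist • Requirements • Serve http://hostname:8080/metrics (any port will do) • Return metric values in the prescribed format • Why I wrote my own • I had problems I wanted to solve at work • No existing exporter met the requirements • Practice writing Go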
  16. Runtime environment • Kubernetes Integrations Autodiscovery https://docs.datadoghq.com/agent/kubernetes/integrations/?tab=kubernetes • API that provides the source data • xxx-exporter (Deployment) • datadog-agent (DaemonSet) • Get http://host/metrics • Get the data • Send custom metrics • Parse and expose the metric
  17. Case studies • GitHub • Issue • Pull Request • CircleCI • Insights API • AWS • RDS Engine Version • RDS Max Connections
  18. Template: title • Problem: • How to solve: • Result:

  19. GitHub Issue • Problem: • Open postmortem issues were piling up • How to solve: • Wrote an exporter that exports the number of issues, with their labels attached as tags • Result: • The current count became visible • We can now decide "if there are N or more, let's go close them" https://github.com/chaspy/github-issue-prometheus-exporter
  20. GitHub Issue $ curl -s localhost:8080/metrics | grep github_issue_prometheus_exporter_issue_count github_issue_prometheus_exporter_issue_count{author="chaspy",label="SRE",number="27193",repo="quipper/quipper"} 1 https://github.com/quipper/quipper/issues?q=is:open+label:Postmortem+label:SRE
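  21. GitHub Pull Request • Problem: • Renovate PRs were piling up • How to solve: • Wrote an exporter that exports the number of PRs, with their labels attached as tags • Result: • The current count became visible, along with the number of repositories • We can now decide "if there are N or more, let's go close them" https://github.com/chaspy/github-pr-prometheus-exporter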
  22. GitHub Pull Request $ curl -s localhost:8080/metrics | grep github_pr_prometheus_exporter_pull_request_count github_pr_prometheus_exporter_pull_request_count{author="renovate[bot]",label="renovate:ingress-nginx,renovate:ingress-nginx/3.20.1",number="1739",repo="quipper/kubernetes-clusters",reviewer="chaspy"} 1 https://github.com/search?q=org:quipper+is:open+author:app/renovate
  23. CircleCI Insights • Problem: • CI for a huge monorepo is slow, and flaky-test problems are hard to analyze • How to solve: • Built a Prometheus Exporter for the CircleCI Insights API • Result: • The results became visible, so we could start improving from the bottlenecks https://github.com/chaspy/circleci-insights-prometheus-exporter
  24. CircleCI Insights

  25. AWS RDS Engine Version • Problem: • It is hard to notice EOL information for RDS engine versions • How to solve: • Wrote a Prometheus Exporter for RDS engine version EOL information • Result: • We can now fire an alert when an RDS instance we run is approaching EOL https://github.com/chaspy/aws-rds-engine-version-prometheus-exporter
  26. AWS RDS Engine Version

  27. AWS RDS Max Connections • Problem: • MaxConnections is not available as a metric, so no alert could be set on it • There was an anomaly alert on the number of connections, but it produced false positives • How to solve: • Wrote a Prometheus Exporter for Max Connections • Result: • Alerts can now be set using Max Connections and the current number of connections https://github.com/chaspy/aws-rds-maxcon-prometheus-exporter
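Once both values exist as metrics, the alert condition is simple arithmetic: current connections divided by max connections, compared with a threshold. The Go sketch below illustrates that check under stated assumptions: `connectionUsage`, `shouldAlert`, the sample numbers, and the 0.8 threshold are hypothetical, and in practice the comparison would live in the monitoring system's alert rule rather than in application code.

```go
package main

import "fmt"

// connectionUsage returns current/maxConn as a ratio.
// It reports ok=false when maxConn is not positive, so callers never divide by zero.
func connectionUsage(current, maxConn float64) (ratio float64, ok bool) {
	if maxConn <= 0 {
		return 0, false
	}
	return current / maxConn, true
}

// shouldAlert reports whether usage has reached the threshold.
func shouldAlert(current, maxConn, threshold float64) bool {
	ratio, ok := connectionUsage(current, maxConn)
	return ok && ratio >= threshold
}

func main() {
	const threshold = 0.8 // assumed value, not from the talk

	fmt.Println(shouldAlert(450, 500, threshold)) // 0.9 of max: true
	fmt.Println(shouldAlert(100, 500, threshold)) // 0.2 of max: false
}
```

A fixed ratio against a known maximum avoids the false positives of an anomaly detector, because the threshold reflects the actual capacity of the instance.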
  28. AWS RDS Max Connections

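  29. Summary (1) • Client libraries make it easy to write your own Prometheus Exporter • Benefits of treating things as metrics • You can filter by tag • You can set arbitrary thresholds • Visualization lets you grasp long-term trends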
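  30. Summary (2) • Metrics over events • Sending every event of interest to Slack and the like is common, but it is an anti-pattern • It does nothing but eat up attention • It is better to convert the events of interest into state and export that state as a Gauge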
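Converting events into state can be as small as folding them into a set and exporting its size. The Go sketch below is a hypothetical illustration of that idea, not code from the talk: the `Event` type, `OpenGauge`, and the metric name `demo_open_issues` are all assumptions.

```go
package main

import "fmt"

// Event is a hypothetical issue event: an issue was opened or closed.
type Event struct {
	Issue  int
	Opened bool // true = opened, false = closed
}

// OpenGauge folds a stream of events into the current set of open issues.
type OpenGauge struct {
	open map[int]struct{}
}

// NewOpenGauge returns an empty gauge state.
func NewOpenGauge() *OpenGauge {
	return &OpenGauge{open: map[int]struct{}{}}
}

// Apply updates the state; repeated events for the same issue are idempotent.
func (g *OpenGauge) Apply(e Event) {
	if e.Opened {
		g.open[e.Issue] = struct{}{}
		return
	}
	delete(g.open, e.Issue)
}

// Value is the number that would be exported as a gauge.
func (g *OpenGauge) Value() float64 {
	return float64(len(g.open))
}

func main() {
	g := NewOpenGauge()
	events := []Event{{1, true}, {2, true}, {1, false}, {3, true}}
	for _, e := range events {
		g.Apply(e)
	}
	// Issues 2 and 3 remain open, so the gauge value is 2.
	fmt.Printf("demo_open_issues %g\n", g.Value())
}
```

Nobody has to read the four individual events; the single gauge value carries the state that a threshold or dashboard needs.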
  31. Metric-Driven Decision Making: decision making based on metrics

  32. Make sound decisions based on metrics as facts • Lately, ◦◦ seems [frequent | rare | fast | slow]? How much is it actually? How much is acceptable? At what point do we take action?
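  33. Make sound decisions based on metrics as facts • Lately, ◦◦ seems [frequent | rare | fast | slow]? Lately we have actually been getting n of them • Check daily or weekly, and take action once it exceeds n • Set a threshold and set up an alert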
  34. Metric-Driven Decision Making: reduce cognitive cost and make sound decisions based on quantitative information

  35. Thank you! Takeshi Kondo (chaspy, chaspy_), Lead Software Engineer, Site Reliability at Quipper