Slide 1

Slide 1 text

Prometheus from a User's Perspective
Monitoring Study Meetup 2017/10/27 @matsumana

Slide 2

Slide 2 text

Self-introduction
• Name: Manabu Matsuzaki
• Affiliation: LINE Fukuoka Corp., Development Office
• Twitter: @matsumana

Slide 3

Slide 3 text

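Agenda
• Which metrics we actually collect
• Best practices when writing an exporter
• Examples of how to write alert rules
• Summary
Note: because of time constraints, this talk does not cover introductory Prometheus material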

Slide 4

Slide 4 text

Assumptions
• Version currently in use
  • Prometheus 1.8.0

Slide 5

Slide 5 text

Exporters currently in use and the metrics we collect (partial excerpt)
• node_exporter
  • Load average
  • CPU usage
  • Memory usage
  • Swap
  • slab
  • Context switches

Slide 6

Slide 6 text

Exporters currently in use and the metrics we collect (partial excerpt)
• node_exporter
  • Disk I/O
  • Disk usage
  • inode count
  • TCP connections
    • ESTABLISHED
    • TIME_WAIT
  • Network traffic

Slide 7

Slide 7 text

Exporters currently in use and the metrics we collect (partial excerpt)
• apache_exporter
  • Metrics available from the mod_status module
• nginx_exporter
  • Metrics available from the ngx_http_stub_status_module module

Slide 8

Slide 8 text

Exporters currently in use and the metrics we collect (partial excerpt)
• plack_exporter
  • Metrics available from Plack::Middleware::ServerStatus::Lite

Slide 9

Slide 9 text

Exporters currently in use and the metrics we collect (partial excerpt)
• jmx_exporter
  • Metrics common to every JVM
    • Heap
    • Metaspace
    • GC
    • Thread count
    • Code cache

Slide 10

Slide 10 text

Exporters currently in use and the metrics we collect (partial excerpt)
• jmx_exporter
  • Metrics specific to middleware and libraries
    • Tomcat
    • Jetty
    • HikariCP
    • Caffeine
    • etc.

Slide 11

Slide 11 text

Exporters currently in use and the metrics we collect (partial excerpt)
• memcached_exporter
  • Cache size
  • Number of cached items
  • Cache hit ratio
  • Operations (get/set)
  • Evictions

Slide 12

Slide 12 text

Exporters currently in use and the metrics we collect (partial excerpt)
• mysqld_exporter
  • QPS
  • Slow queries
  • Replication delay
  • binlog
  • Connection count
  • InnoDB-related metrics

Slide 13

Slide 13 text

Exporters currently in use and the metrics we collect (partial excerpt)
• redis_exporter
  • Memory used
  • Key count
  • Evictions
  • Slow queries

Slide 14

Slide 14 text

Exporters currently in use and the metrics we collect (partial excerpt)
• elasticsearch_exporter
  • Cluster health
  • Document count (primary shards only)
  • Document size (primary shards only)
  • Thread pool queue count
  • Thread pool rejected count
  • Circuit breaker tripped count

Slide 15

Slide 15 text

Exporters currently in use and the metrics we collect (partial excerpt)
• elasticsearch_exporter
  • Fielddata cache size
  • Query cache size
  • Query cache hit, miss, total
  • Request cache size

Slide 16

Slide 16 text

Exporters currently in use and the metrics we collect (partial excerpt)
• fluent-plugin-prometheus
  • Metrics for each BufferedOutput plugin
    • buffer_queue_length
    • buffer_total_bytes
    • retry_count

Slide 17

Slide 17 text

Exporters currently in use and the metrics we collect (partial excerpt)
• fluentd_exporter
  • CPU usage per fluentd process
  • Memory usage per fluentd process
• td-agent_exporter
  • CPU usage per fluentd process
  • Memory usage per fluentd process
• fluent-agent-lite_exporter
  • CPU usage per fluentd process
  • Memory usage per fluentd process

Slide 18

Slide 18 text

Best practices when writing an exporter
First, check whether someone else has already written one
• https://prometheus.io/docs/instrumenting/exporters/
• https://github.com/prometheus/prometheus/wiki/Default-port-allocations
• Search the web
• If an existing exporter does not export the metrics you need, send a PR

Slide 19

Slide 19 text

Best practices when writing an exporter
When writing your own
• Official guidelines
  https://prometheus.io/docs/instrumenting/writing_exporters/

Slide 20

Slide 20 text

Best practices when writing an exporter
When writing your own
• Official client libraries
  https://prometheus.io/docs/instrumenting/clientlibs/
  • The official libraries cover only Go, Java, Python, and Ruby
  • Third-party libraries exist for PHP and other languages

Slide 21

Slide 21 text

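Best practices when writing an exporter
When writing your own
• Go looks like a good choice for writing exporters, but there are caveats
  • Go itself does not support Linux kernels older than 2.6.23
    On the CentOS 5 series, exporters written in Go may not run correctly
    https://github.com/golang/go/wiki/MinimumRequirements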

Slide 22

Slide 22 text

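Best practices when writing an exporter
A selection of items from the official guidelines
• Use a port that no other exporter uses, and register it once your exporter exists
  https://github.com/prometheus/prometheus/wiki/Default-port-allocations
• Follow the best practices for metric names and label names
  https://prometheus.io/docs/practices/naming/
• Do not compute derived values in the exporter; compute them in PromQL
  • Example: cache hit ratio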
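The "compute it in PromQL" guideline can be illustrated with a minimal sketch. The metric names `mycache_hits_total` and `mycache_misses_total` and the function `render_cache_metrics` are hypothetical, chosen only for illustration; the point is that the exporter exposes raw counters and leaves the ratio to the query side.

```python
def render_cache_metrics(hits: int, misses: int) -> str:
    """Render raw cache counters in the Prometheus text exposition format.

    The metric names are hypothetical. No hit ratio is calculated here.
    """
    lines = [
        "# TYPE mycache_hits_total counter",
        f"mycache_hits_total {hits}",
        "# TYPE mycache_misses_total counter",
        f"mycache_misses_total {misses}",
    ]
    return "\n".join(lines) + "\n"


# The hit ratio would then be derived at query time, for example:
#   rate(mycache_hits_total[5m])
#     / (rate(mycache_hits_total[5m]) + rate(mycache_misses_total[5m]))
if __name__ == "__main__":
    print(render_cache_metrics(hits=3, misses=1), end="")
```

Keeping the counters raw lets the query choose the time window afterwards, which a ratio precomputed in the exporter would not allow.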

Slide 23

Slide 23 text

Best practices when writing an exporter
A selection of items from the official guidelines
• Export the exporter's version number as a metric
  • As the number of monitored servers and exporters grows, it becomes hard to tell which version is deployed where
  • For reference, Prometheus itself exports a metric like the following
    prometheus_build_info{branch="master",goversion="go1.8.3",revision="3afb3fffa3a29c3de865e1172fb740442e9d0133",version="1.7.1"} 1
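As a rough sketch of how an exporter might produce such a line itself, the snippet below renders a `<name>_build_info` sample with a constant value of 1 and the version details as labels. The helper names and the example label values are assumptions made for illustration; an official client library would normally take care of this formatting.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_build_info(name: str, labels: dict) -> str:
    """Render '<name>_build_info{...} 1' in the text exposition format.

    Labels are sorted so that the output is stable between scrapes.
    """
    label_str = ",".join(
        f'{key}="{escape_label_value(value)}"'
        for key, value in sorted(labels.items())
    )
    return f"{name}_build_info{{{label_str}}} 1\n"


if __name__ == "__main__":
    # "my_exporter" and these label values are hypothetical.
    print(render_build_info("my_exporter", {"version": "0.1.0", "revision": "abc"}), end="")
```

Because the value is always 1, a query such as `count by (version) (my_exporter_build_info)` would show how many instances run each version.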

Slide 24

Slide 24 text

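Best practices when writing an exporter
A selection of items from the official guidelines
• Export an xxx_up metric
  • For MySQL, for example, the metric is called mysql_up
  • Export 1 if metrics were fetched from the monitored target successfully, 0 if the fetch failed
  • The reason is explained later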
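The xxx_up convention can be sketched as follows. The `collect` function and the `probe` callables are hypothetical: `probe` stands in for whatever code talks to the monitored target, and any exception it raises is treated as a failed fetch.

```python
from typing import Callable, Dict


def collect(probe: Callable[[], Dict[str, float]], prefix: str) -> str:
    """Call probe() and render its values plus a '<prefix>_up' sample.

    '<prefix>_up' is 1 when the probe succeeded and 0 when it raised.
    The exporter still answers the scrape in either case.
    """
    try:
        values = probe()
        up = 1
    except Exception:
        values = {}
        up = 0
    lines = [f"{prefix}_up {up}"]
    lines += [f"{prefix}_{name} {value}" for name, value in sorted(values.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    def healthy_probe() -> Dict[str, float]:
        return {"connections": 5}

    def failing_probe() -> Dict[str, float]:
        raise ConnectionError("connection refused")

    print(collect(healthy_probe, "mysql"), end="")
    print(collect(failing_probe, "mysql"), end="")
```

Answering the scrape even when the target is unreachable keeps "the exporter is down" and "the target behind the exporter is down" distinguishable.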

Slide 25

Slide 25 text

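Examples of how to write alert rules
• Example 1) Monitor whether an exporter is down
• Example 2) Monitor whether an exporter's monitored target is down
• Example 3) Monitor whether a daily batch job is running normally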

Slide 26

Slide 26 text

Examples of how to write alert rules
Example 1) Monitor whether an exporter is down
• The Prometheus server itself records an up metric for every exporter
  https://prometheus.io/docs/concepts/jobs_instances/

  Example: for node_exporter
    up{instance="host00:9100", job="node"} 1

Slide 27

Slide 27 text

Examples of how to write alert rules
Example 1) Monitor whether an exporter is down
• So registering a single alert rule like the following covers every exporter

    ALERT ExporterDown
      IF up == 0
      FOR 1m
      LABELS {severity="major"}
      ANNOTATIONS {
        description="{{ $labels.instance }} of job {{ $labels.job }} has been down"
      }

Slide 28

Slide 28 text

Examples of how to write alert rules
Example 2) Monitor whether an exporter's monitored target is down
• Registering a single alert rule like the following covers every exporter that has an xxx_up metric

    ALERT ExporterTargetDown
      IF {__name__=~".*_up"} == 0
      FOR 1m
      LABELS {severity="major"}
      ANNOTATIONS {
        summary="Exporter target ( {{ $labels.job }} ) down"
      }

Slide 29

Slide 29 text

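Examples of how to write alert rules
Example 3) Monitor whether a daily batch job is running normally
• Export a metric with node_exporter's Textfile Collector
  (the metric can also be exported through pushgateway)
• Have the job scheduler run a shell script like the following

    #!/bin/bash
    # Exit immediately if any command fails
    set -e

    RESULT=/path/to/node_exporter_textfile_directory/my_batch_job.prom

    # Run the job
    /path/to/my_batch_job.sh

    # Record the completion time only after the job has succeeded
    UNIXTIME=$(date +%s)

    # Write to a temporary file and rename it, so that node_exporter
    # never reads a partially written file. The single quotes keep the
    # double quotes around the label values in the output.
    echo 'my_batch_job_completion_time{cron="my_batch_job",period="daily"}' "$UNIXTIME" > "$RESULT.tmp"
    mv "$RESULT.tmp" "$RESULT"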
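For batch jobs that are themselves written in Python, the same write-then-rename pattern might look like the sketch below. The function name `write_completion_metric` and its parameters are hypothetical, and the directory is assumed to be the one node_exporter's Textfile Collector is configured to read.

```python
import os
import tempfile
import time
from typing import Optional


def write_completion_metric(directory: str, job: str, period: str,
                            now: Optional[float] = None) -> str:
    """Atomically write '<job>_completion_time' into '<directory>/<job>.prom'.

    The sample is written to a temporary file in the same directory and then
    renamed, so a reader never sees a partially written file.
    """
    timestamp = int(time.time() if now is None else now)
    body = f'{job}_completion_time{{cron="{job}",period="{period}"}} {timestamp}\n'
    final_path = os.path.join(directory, f"{job}.prom")

    # The ".tmp" suffix keeps the temporary file from matching "*.prom".
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(body)
    # mkstemp creates the file with mode 0600; widen it so that a
    # node_exporter running as another user can read it.
    os.chmod(tmp_path, 0o644)
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        path = write_completion_metric(demo_dir, "my_batch_job", "daily")
        with open(path) as handle:
            print(handle.read(), end="")
```

Calling this as the last step of the job, after the real work has succeeded, means a failed run leaves the previous timestamp in place.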

Slide 30

Slide 30 text

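Examples of how to write alert rules
Example 3) Monitor whether a daily batch job is running normally
• Registering an alert rule like the following detects a job that has not completed in the last 24 hours

    ALERT DailyCronLate
      IF time() - {period="daily"} > 60 * 60 * 24
      FOR 5m
      LABELS {severity="warning"}
      ANNOTATIONS {
        summary="Daily cronjob for {{ $labels.cron }} has not run in 24 hours"
      }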

Slide 31

Slide 31 text

Summary
• Introduced the exporters we use and some of the metrics they provide
• Covered a caveat (Go) to keep in mind when writing an exporter
• Picked out several best practices for writing an exporter
• Showed several examples of alert rules

Slide 32

Slide 32 text

Thank you :)