Cloud-Native Monitoring with Mackerel
2019.12.23 Mackerel Day #2
@fujiwara

@fujiwara
- Mackerel Ambassador
- github.com/kayac/ecspresso: deployment tool for Amazon ECS
- github.com/fujiwara/lambroll: deployment tool for AWS Lambda

Game & Community
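
Cloud native?
CNCF Cloud Native Definition v1.0:
"Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach."
https://github.com/cncf/toc/blob/master/DEFINITION.md
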
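Containers, service meshes, microservices... using these technologies ≠ being cloud native
- Cloud native means "Design for Failure", the mechanisms and components that realize it, and mastering them.
- Design the system so that it recovers automatically when a failure occurs.
- Design the architecture so that a failure has no user impact in the first place. [2]
[2] "Design for Failure is the true Cloud Native", toricls, speakerdeck.com

"Design for Failure"
Design the system on the assumption that failures will happen.
- Instances die.
- Managed services die (but they usually fail over).
- Whole data centers die (rarely, but it happens).
Decide how far to plan for, and where to give up.
Monitoring has to be designed with the same assumptions → cloud-native monitoring.

Monitoring targets grow and shrink dynamically
- Containers are reborn on every deployment.
- Even without containers:
  - EC2 Auto Scaling with Spot Instances
  - Instances go down fairly often, and others come up in their place.
- Managed services can now scale with load as well (Aurora Auto Scaling, etc.).
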
The first step toward cloud native with Mackerel: "Stop connectivity monitoring"

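Stop connectivity monitoring
- This alert can only be Critical.
  - Nobody needs to jump out of bed because a development host went down in the middle of the night.
  - In production, build the system so that no single host going down affects the service.
- Designing on the premise that hosts die = cloud native.
- Reserve Critical alerts for things that affect service continuity.
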
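How cloud native is Mackerel?
- It follows hosts automatically as they are added and removed (mackerel-container-agent, AWS/Azure integrations).
- Host metrics stop being visible once a host is retired. Only some metrics such as CPU remain; custom metrics, which we would like to keep long term as sums and averages, disappear.
- Billing is per host. The monthly per-host price of a micro host is a bit high for monitoring cheap targets with few metrics, such as SQS and Lambda. A plan billed on total metric count would be welcome.
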
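While counting on managed services to keep evolving, advance cloud-native monitoring with "sukima-kagu OSS" (隙間家具OSS, literally "gap-furniture OSS")
Definition: software developed to fill the gap and enable better operations when a managed service's features, or the integration between services, fall short for your own operations. The term refers specifically to open-source software. [3]
[3] fujiwara's Speaker Deck talks on sukima-kagu OSS, including the one given at AWS Dev Day Tokyo (speakerdeck.com)

The sukima-kagu OSS introduced today
- maprobe: https://github.com/fujiwara/maprobe
- mackerel-plugin-prometheus-query: https://github.com/fujiwara/mackerel-plugin-prometheus-query
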
[Problem] We also want mackerel-plugin metrics for hosts registered through the AWS integration

Example: running mackerel-plugin-mysql against Amazon RDS (MySQL)

[Solution?] Run the plugin from mackerel-agent on some other host?
    [plugin.metrics.rds01]
    command = "mackerel-plugin-mysql -host='rds01.***.ap-northeast-1.rds.amazonaws.com' (omitted)"
    custom_identifier = "rds01.***.ap-northeast-1.rds.amazonaws.com"

    [plugin.metrics.rds02]
    command = "mackerel-plugin-mysql -host='rds02.***.ap-northeast-1.rds.amazonaws.com' (omitted)"
    custom_identifier = "rds02.***.ap-northeast-1.rds.amazonaws.com"
- If this host goes down, metric collection stops.
- Monitoring hosts that are added later requires a tedious configuration change.
- These days there may be no host left to run mackerel-agent on.
We want to follow changes in the monitoring targets automatically!

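maprobe
An agent that, for hosts registered in Mackerel:
- performs external monitoring (ping, tcp, http)
- runs mackerel-plugin commands and posts the results as host metrics
- aggregates already-registered host metrics and posts them as service metrics
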
[Solution] Run the plugin with maprobe
    probes:
      - service: production
        role: RDS
        command:
          command:
            - 'mackerel-plugin-mysql'
            - '-host={{.Host.CustomIdentifier}}'
            - '-username=root'
            - '-password={{env "RDS_PASSWORD"}}'
For every host in service "production", role "RDS", maprobe runs mackerel-plugin-mysql and sends the results to Mackerel as that host's own host metrics.

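The per-host fan-out in the probe configuration can be pictured with a small sketch. This is not maprobe's implementation (maprobe uses its own template syntax, as shown in the configuration): the `Host` class, the `expand_command` helper, the `{custom_identifier}` placeholder style, and the host names are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for a host record looked up in Mackerel."""
    name: str
    custom_identifier: str


def expand_command(template, host, env):
    """Fill per-host and environment values into a command template."""
    values = {"custom_identifier": host.custom_identifier}
    values.update(env)
    return [arg.format(**values) for arg in template]


# One template, expanded once per host in the service/role.
template = [
    "mackerel-plugin-mysql",
    "-host={custom_identifier}",
    "-username=root",
    "-password={RDS_PASSWORD}",
]
hosts = [
    Host("rds01", "rds01.example.internal"),  # made-up identifiers
    Host("rds02", "rds02.example.internal"),
]
commands = [expand_command(template, h, {"RDS_PASSWORD": "secret"}) for h in hosts]
for command in commands:
    print(" ".join(command))
```

The point of the design is that the configuration names a role, not individual hosts, so a host added to the role later is picked up without editing the file.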
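maprobe follows host changes automatically
- It queries the Mackerel API every minute to look up hosts, so added and removed hosts are followed automatically.
- A Docker image is available (https://hub.docker.com/r/fujiwara/maprobe):
    docker pull fujiwara/maprobe
- It automatically reloads a configuration file placed on S3, so updating the S3 object applies new settings without rebuilding or redeploying the container:
    maprobe agent --config s3://example.com/config.yaml
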
[Problem] If we stop connectivity monitoring, how do we monitor whether hosts are alive?

[Solution] maprobe's health check features
Built-in health checks (ping, TCP, HTTP) are available.
    probes:
      - service: production
        role: EC2
        ping:
          address: "{{ .Host.IPAddresses.eth0 }}"
      - service: production
        role: ElastiCacheRedis
        tcp:
          host: "{{ .Host.CustomIdentifier }}"
          port: 6379
          send: "PING\n"
          expect_pattern: "PONG"

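What a TCP probe with `send` and `expect_pattern` does can be sketched in a few lines. This is an illustration under assumptions, not maprobe's code: the `tcp_check` function and the toy `FakeRedis` server are hypothetical, and the server only exists so the example runs without a real Redis.

```python
import re
import socket
import socketserver
import threading


def tcp_check(host, port, send, expect_pattern, timeout=3.0):
    """Return True if the reply to `send` matches `expect_pattern`."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(send.encode())
            reply = conn.recv(1024).decode(errors="replace")
    except OSError:
        # Connection refused, timeout, reset: all count as a failed check.
        return False
    return re.search(expect_pattern, reply) is not None


class FakeRedis(socketserver.StreamRequestHandler):
    """Toy server for the demo: answers any line with +PONG."""

    def handle(self):
        self.rfile.readline()
        self.wfile.write(b"+PONG\r\n")


server = socketserver.TCPServer(("127.0.0.1", 0), FakeRedis)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

ok = tcp_check("127.0.0.1", port, "PING\n", "PONG")    # server is up
server.shutdown()
server.server_close()
dead = tcp_check("127.0.0.1", port, "PING\n", "PONG")  # server is gone
print(ok, dead)
```

Recording the outcome as a success or failure count per host, rather than raising an alert directly, is what lets the result be treated as an ordinary host metric.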
Health checks with maprobe
- maprobe's health check results become host metrics.
- They are not check monitors.

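What is not great about check monitoring (personal opinion)
- Changing settings means editing a file and deploying it.
  - "Turn monitoring off just for now" or "tweak a threshold" is hard.
- How thresholds are evaluated varies from plugin to plugin.
  - Does --critical-under mean "less than or equal to" or "less than"?
- Alerts tend to fire on many hosts at once.
  - Check failures arrive from dozens of hosts, yet there is a single root cause: NTP clock drift, a daemon configuration or deploy mistake, and so on.
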
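The threshold ambiguity is easy to see at the boundary value. The sketch below makes no claim about which reading any particular plugin implements; both function names are hypothetical and simply spell out the two possible interpretations of an "under" threshold.

```python
def critical_under_inclusive(value, threshold):
    """Reading 1: fires when value is at or below the threshold."""
    return value <= threshold


def critical_under_exclusive(value, threshold):
    """Reading 2: fires only when value is strictly below the threshold."""
    return value < threshold


threshold = 3
for observed in (2, 3, 4):
    print(
        observed,
        critical_under_inclusive(observed, threshold),
        critical_under_exclusive(observed, threshold),
    )
```

The two readings agree everywhere except when the observed value equals the threshold, which is exactly the case an operator tends to configure for (for example, "alert when only 3 workers remain").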
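Check monitoring = a special case of metric monitoring
- Store the metric and evaluate it, and you can do the same thing.
- Make use of metric monitoring and expression monitoring.
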
Example: liveness monitoring with ping
    sum(role(production:EC2, ping.count.failure))
- Warn if ping fails for any host in production:EC2.
- If something goes down but the service is still being served, it is not Critical.

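Example: alerting on jobs backed up in a job queue
    sum(role(production:job-queue, custom.gearmand.queue.*.total))
- Job queues run on multiple hosts, and the number of backed-up jobs is recorded as a metric.
- When jobs back up, they usually back up in every queue.
- Alerting per host with check monitoring tends to fire on every host.
- Watching the total shows the processing state of the whole system.
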
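The difference between per-host checks and a single expression over the role can be sketched as follows. This is not how Mackerel evaluates expression monitors; the function names, host names, backlog figures, and thresholds are all made up for illustration.

```python
def per_host_alerts(backlog_by_host, threshold):
    """Check-style monitoring: one alert per host over its own threshold."""
    return [host for host, jobs in sorted(backlog_by_host.items()) if jobs > threshold]


def aggregate_alert(backlog_by_host, threshold):
    """Expression-style monitoring: one alert on the sum across the role."""
    total = sum(backlog_by_host.values())
    return total, total > threshold


# Hypothetical snapshot: every queue is backed up at the same time.
backlog = {"queue01": 1200, "queue02": 1150, "queue03": 1300}

noisy = per_host_alerts(backlog, threshold=1000)
total, firing = aggregate_alert(backlog, threshold=3000)
print(noisy)          # three separate alerts for one underlying problem
print(total, firing)  # one alert describing the whole system
```

With a shared root cause, the per-host approach produces as many alerts as there are hosts, while the aggregate produces one and also shows the size of the overall backlog.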
[Problem] We want long-term use of host metrics that disappear on retirement

Custom metrics disappear when their host is retired.

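[Solution] Aggregate and store host metrics with maprobe
    aggregates:
      - service: production
        role: push-server
        metrics:
          - name: custom.push.messages.sent
            outputs:
              - func: sum
                name: custom.push.messages.total_sent
For service "production", role "push-server", maprobe fetches the host metric custom.push.messages.sent from every host, then stores the computed result as a service metric.
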
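maprobe aggregate functions
- sum, min, max, average, and count are currently supported.
- percentile would be nice to have, but it is not implemented yet.
- If custom metrics did not disappear, expression graphs would be enough, so please consider it.
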
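The aggregation step can be pictured as applying a named function to the latest value from each host and storing the result under a service metric name. The sketch below is an assumption-laden illustration, not maprobe's code: the `aggregate` helper, the host names, the values, and the two extra output names (`avg_sent`, `hosts`) are hypothetical.

```python
from statistics import mean

# The five functions listed on the slide, mapped to Python equivalents.
AGGREGATE_FUNCS = {
    "sum": sum,
    "min": min,
    "max": max,
    "average": mean,
    "count": len,
}


def aggregate(values, outputs):
    """Apply each configured output function to the per-host values."""
    return {out["name"]: AGGREGATE_FUNCS[out["func"]](values) for out in outputs}


# Hypothetical latest values of custom.push.messages.sent per host.
latest_by_host = {"push01": 120, "push02": 80, "push03": 100}
outputs = [
    {"func": "sum", "name": "custom.push.messages.total_sent"},
    {"func": "average", "name": "custom.push.messages.avg_sent"},
    {"func": "count", "name": "custom.push.messages.hosts"},
]
service_metrics = aggregate(list(latest_by_host.values()), outputs)
print(service_metrics)
```

Because the result is stored as a service metric, it survives the retirement of any individual host that contributed to it.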
[Problem] Monitoring more cloud-native resources

Microservices! Service mesh!
We have started using Envoy (https://envoyproxy.io), so we want its metrics. Hitting /stats on one Envoy returns:
    $ curl -s x.x.x.x:9901/stats
    ...
    cluster.web.default.total_match_count: 1
    cluster.web.external.upstream_rq_200: 988
    cluster.web.external.upstream_rq_2xx: 988
    cluster.web.external.upstream_rq_302: 13
    cluster.web.external.upstream_rq_3xx: 13
    cluster.web.external.upstream_rq_400: 3
    cluster.web.external.upstream_rq_403: 26
    cluster.web.external.upstream_rq_404: 5
    cluster.web.external.upstream_rq_4xx: 34
    ....

Should we just send all of this to Mackerel?
    cluster.web.default.total_match_count: 1
    cluster.web.external.upstream_rq_200: 988
    cluster.web.external.upstream_rq_2xx: 988
    cluster.web.external.upstream_rq_302: 13
    cluster.web.external.upstream_rq_3xx: 13
    cluster.web.external.upstream_rq_400: 3
    cluster.web.external.upstream_rq_403: 26
    cluster.web.external.upstream_rq_404: 5
    cluster.web.external.upstream_rq_4xx: 34
    ...

Inquiry
I casually asked through the feedback form:
Q: "We plan to introduce envoy (https://www.envoyproxy.io/) soon. Does Mackerel plan to publish an official plugin for collecting envoy stats?"
A: "Regarding an envoy plugin, we currently have no plans to publish one officially."
→ So, build one ourselves...

However, Envoy emits a large number of metrics
    $ curl -s x.x.x.x:9901/stats | wc -l
    337
Sending all of this to Mackerel means that, given the micro-host metric quota per host, every host running Envoy is billed as multiple micro hosts (tax excluded).

Change of plan
- Sending everything is too costly.
- That said, we do not really know yet which values we want.
  - Our Envoy operating experience is shallow, so we want to decide what to watch while operating it.
  - Writing a plugin that picks out the useful metrics requires operating experience.
- In other words, for now we want to collect everything.

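What about Prometheus?
- A pull-based metric collection and monitoring tool.
- Stored values can be flexibly processed and retrieved with PromQL queries.
- Long-term metric retention is not much of a design focus.
- Visualizing with Grafana or similar is the usual approach?
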
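Envoy → Prometheus → Mackerel
- We want to alert in Mackerel, and view graphs there too.
- We cannot yet see which values to collect, so for the time being we want to keep everything.
1. Envoy → Prometheus (already exists)
2. Prometheus → output query results in metric plugin format
3. Store the metric-plugin-format output in Mackerel (already exists)
Only step 2 needs to be built! Any value that turns out to be missing can be fetched by querying Prometheus.
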
I built mackerel-plugin-prometheus-query
github.com/fujiwara/mackerel-plugin-prometheus-query
    $ mackerel-plugin-prometheus-query \
        -query "up" \
        -metric-key-format "promq.{job}.{instance}"
    promq.web.10_1_129_175_9901 1 1575941187
    promq.web.10_1_130_170_9901 1 1575941187
    promq.web.10_1_131_53_9901 1 1575941187
    promq.prometheus.localhost_9090 1 1575941187

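How a query result could turn into the lines above can be sketched as follows. This is not the plugin's source: `format_metric_key` and `to_plugin_line` are hypothetical helpers, and the sanitization rule (anything outside letters, digits, `_`, and `-` becomes `_`) is an assumption inferred from the sample output, where `10.1.129.175:9901` appears as `10_1_129_175_9901`. The sample dictionary follows the shape of one entry in a Prometheus instant query result, and the output line uses the tab-separated name, value, epoch-seconds layout of Mackerel metric plugins.

```python
import re


def format_metric_key(key_format, labels):
    """Replace each {label} placeholder with a sanitized label value."""
    def substitute(match):
        value = labels.get(match.group(1), "")
        # Assumed rule: anything outside [A-Za-z0-9_-] becomes "_".
        return re.sub(r"[^A-Za-z0-9_-]", "_", value)

    return re.sub(r"\{(\w+)\}", substitute, key_format)


def to_plugin_line(sample, key_format):
    """Turn one instant-vector sample into a metric plugin line."""
    timestamp, value = sample["value"]
    key = format_metric_key(key_format, sample["metric"])
    return "\t".join([key, str(value), str(int(timestamp))])


sample = {
    "metric": {"__name__": "up", "job": "web", "instance": "10.1.129.175:9901"},
    "value": [1575941187, "1"],
}
line = to_plugin_line(sample, "promq.{job}.{instance}")
print(line)
```

Deriving the key from labels is what lets one query fan out into one metric per series without listing the instances anywhere.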
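Prometheus query example
Example: a query for the number of upstream retries over one minute
    sum(delta(envoy_cluster_upstream_rq_retry{envoy_cluster_name="web"}[1m]))
- Send Mackerel the overall value rather than per-host values.
- The individual values are still available in Prometheus.
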
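The shape of that query, a per-series change over a window followed by a sum across series, can be illustrated with a simplified sketch. It deliberately ignores details of real PromQL evaluation such as extrapolation to the window boundaries, so the numbers it produces are not what Prometheus would return exactly; the function names and sample data are hypothetical.

```python
def simple_delta(samples):
    """Last value minus first value within the window.

    `samples` is a time-ordered list of (timestamp, value) pairs.
    Returns None when there are fewer than two samples.
    """
    if len(samples) < 2:
        return None
    return samples[-1][1] - samples[0][1]


def sum_of_deltas(series_by_instance):
    """Sum the per-series deltas, skipping series without enough samples."""
    deltas = [simple_delta(samples) for samples in series_by_instance.values()]
    return sum(d for d in deltas if d is not None)


# Made-up retry counters scraped from three Envoy instances over one minute.
window = {
    "10.1.129.175:9901": [(0, 10), (30, 12), (60, 15)],
    "10.1.130.170:9901": [(0, 3), (60, 4)],
    "10.1.131.53:9901": [(60, 7)],  # only one sample: no delta
}
total_retries = sum_of_deltas(window)
print(total_retries)
```

Collapsing the series into one number is what keeps the Mackerel side to a single metric, while the per-instance breakdown stays queryable in Prometheus.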
Post the plugin output with mkr throw
    $ mackerel-plugin-prometheus-query \
        -query 'sum(delta(envoy_cluster_upstream_rq_retry_success{envoy_cluster_name="web"}[1m]))' \
        -metric-key-format "envoy.web.upstream.retry.success" \
        | mkr throw --service production

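Summary
What cloud-native monitoring means:
- A design that tolerates failures
- Watching the health of the system as a whole
Sukima-kagu OSS as tools for that:
- maprobe: automatic host tracking, external monitoring, plugin monitoring, metric aggregation
- mackerel-plugin-prometheus-query: short term in Prometheus, long term and alerting in Mackerel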