Design and Implementation of an Observability Tool Focusing on the Relationships Between Processes in Distributed Systems / Transtracer CNDK2019

CloudNative Days KANSAI 2019

Yuuki Tsubouchi (yuuk1)

November 28, 2019

Transcript

  1. Design and Implementation of an Observability Tool Focusing on the Relationships Between Processes in Distributed Systems

    SAKURA internet Research Center, Yuuki Tsubouchi / @yuuk1t. CloudNative Days Kansai 2019, 2019.11.28. (C) Copyright 1996-2019 SAKURA internet Inc.
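  2. Self-introduction: Yuuki Tsubouchi / ゆううき (https://yuuk.io/, @yuuk1t, id:y_uuki)

    ・Career: Hatena Co., Ltd., web operations engineer and SRE, developing and operating web services (5 years)
    ・Currently: researcher at SAKURA internet Research Center, SAKURA internet Inc., researching internet infrastructure technology
    ・Site Reliability Engineering (SRE) Researcher
    ・Steering committee member, IPSJ Special Interest Group on Internet and Operation Technology
    ・Lecturer, Security Camp National Convention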
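  3. Motivation for this talk

    ・In Japan, the dominant topic is how to "use" existing CloudNative software well
    ・I would also like to talk about "building" CloudNative software
    ・At this point the work is still a Proof of Concept (PoC), so it will not immediately help with the problems in front of you
    ・Even so, I want to encourage the motivation to "build"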
  4. Transtracer: Automatic Tracing of Inter-Process Dependencies by Monitoring the Endpoints of TCP/UDP Communication in Distributed Systems. https://yuuk.io/papers/transtracer_iots2019.pdf

  5. Table of contents

    1. Introduction 2. Observability and tracing techniques 3. Design and implementation of Transtracer 4. Experiments 5. Future work 6. Summary
  6. 1. Introduction

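  7. Background: why distributed applications are becoming more complex

    ・Features are added while a service is offered over a long period
    ・Access from users increases
    ・A single service provider offers multiple services (SNS, e-commerce sites, and so on)
    ・Purpose-specific middleware is added
    ・New and old systems are mixed
    ・The number of hosts grows through scale-out
    ・Multiple services share some parts (user authentication infrastructure and so on)
    ・Microservices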
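  8. Problems created by increasing complexity

    ・When changing a component, the range affected by the change is unclear
    ・Investigation takes time, or the change is abandoned
    ・When a failure occurs, the correlations and causal relationships between anomalies in different places are unknown
    ・Administrators manually cross-check metrics and logs to find the cause
    ・Dependencies change while the system is running, so keeping documentation up to date by hand is costly
    ・Conclusion: dependencies of processes unknown to the system administrator need to be traced automatically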

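  9. Existing automatic tracing technology: distributed tracing

    ・Traces which path an L7 request takes through the system
    ・Requires processing on the communication path that embeds an identifier in the request, and processing that records trace data
    ・Latency overhead: the overhead of embedding identifiers
    ・False negatives: identifiers are embedded after dependencies are already known, so the technique is not suited to discovering unknown processes
    ・Instrumentation cost: tracing code has to be added to existing components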
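  10. Idea for solving the problems

    ・Goal: trace dependencies on processes unknown to the administrator
    ・Challenges: 1. false negatives in dependency tracing 2. latency overhead
    ・Solution: monitor sockets, the endpoints of TCP/UDP connections, and trace flows automatically
    ・Any process that uses sockets can be traced exhaustively
    ・Socket monitoring is independent of the processes' communication, so it adds no latency overhead
    ・Diagram: a Linux OS with kernel and user space; processes exchange TCP/UDP flows, and the sockets are monitored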
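  11. Obtaining dependencies starting from a host/Pod

    ・Dependencies are obtained starting from the node that is about to be changed (the target)
    ・This identifies the range affected by the upcoming change
    ・Use case: software that changes the system automatically
    ・If the affected range includes a component whose error budget (in SRE terms) is running low, the administrator is notified first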
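The idea of collecting dependencies from the node being changed can be sketched as a graph traversal. This is a minimal sketch, not Transtracer's actual code: the `flows` list, the node names, and the function `affected_by_change` are hypothetical. Each flow is an `(active, passive)` pair, meaning the active side depends on the passive side.

```python
from collections import defaultdict, deque

def affected_by_change(flows, target):
    """Return every node that depends, directly or transitively, on target.

    flows: iterable of (active, passive) pairs; active depends on passive.
    """
    dependents = defaultdict(set)  # passive node -> nodes that connect to it
    for active, passive in flows:
        dependents[passive].add(active)

    seen = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dep in dependents[node]:
            if dep not in seen and dep != target:
                seen.add(dep)
                queue.append(dep)
    return seen

# Hypothetical flows: web -> app -> db, batch -> db
flows = [("web", "app"), ("app", "db"), ("batch", "db")]
print(sorted(affected_by_change(flows, "db")))
```

A breadth-first search is used here only because it is simple. Any traversal that follows the reversed dependency edges would give the same set.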
  12. 2. Observability and tracing techniques

  13. The growing complexity of distributed applications

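  14. Architecture of distributed applications over the last ten years or so

    ・In addition to the three-tier web architecture, NoSQL servers and similar components have been added
    ・Diagram: external DNS server, internal DNS server, web server, application server, RDB server, full text search server, KVS server, and batch server, connected by application flows and DNS flows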
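  15. Composition of a single host/Pod

    ・Connection management is delegated to reverse proxies and sidecar proxies
    ・Log collection agents and monitoring agents run alongside on the same host
    ・Diagram: main network process, log collector agent, monitoring agent, proxy, user authentication, DNS forwarder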
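  16. The spread of architectures in which multiple systems are coupled

    ・Connection management is delegated to reverse proxies and sidecar proxies
    ・Log collection agents and monitoring agents run alongside on the same host
    ・Diagram: systems coupled through a message queue and a reverse proxy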

  17. Observability

  18. Observability

    ・Reference: [Sridharan 17] Cindy Sridharan, Monitoring in the time of Cloud Native, Velocity, 2017.
    ・Low observability: humans and systems connected through monitoring systems (logs, metrics; alerting, checking, investigating), with investigation relying on tools such as top, sar, iostat, tail …
    ・High observability: the same monitoring systems plus observability systems that add traces
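  19. Observability in the context of SRE

    ・SRE is a body of techniques for controlling reliability and increasing the speed of change
    ・Figure: reliability between 99% and 100%, with the control range marked
    ・In general, changing a system carries the risk of lowering reliability
    ・The key is how far the risk, and with it the control range, can be kept small
    ・Raising observability makes it possible to predict the scale of the risk (the goal of Transtracer) and to keep the impact of a risk that has materialized to a minimum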
  20. Classification of tracing techniques

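  21. Classification of tracing techniques

    ・Request-based approach: traces which communication path an application-layer request takes
    ・Connection-based approach: traces dependencies by obtaining the state of transport connections on the end hosts
    ・Packet-based approach: traces dependencies by observing packet headers on hosts or switches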
  22. Request-based approach

    ・An identifier is assigned to each Layer 7 request, embedded in subsequent requests, and propagated to downstream processes
    ・Using the identifier, the approach traces which processes in the system handled the request
    ・Advantage: application processing details and L7 protocol information can be traced
    ・Challenges: latency overhead, false negatives, instrumentation cost (same as p.8)
    ・M. Y. Chen, et al., Pinpoint: Problem Determination in Large, Dynamic Internet Services, IEEE/IFIP International Conference on DSN, pp. 595–604, 2002.
    ・P. Barham, et al., Magpie: Online Modelling and Performance-aware Systems, HotOS, pp. 85–90, 2003.
    ・R. Fonseca, et al., X-Trace: A Pervasive Network Tracing Framework, USENIX Conference on NSDI, pp. 20–20, 2007.
    ・B. H. Sigelman, et al., Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, Technical report, Google, 2010.
  23. Google Dapper [Sigelman 2010]

    ・The prototype of today's distributed tracing technology (OpenTelemetry and others)
    ・Characterized by low overhead and application transparency
    ・Adaptive sampling: the sampling rate can be changed according to traffic volume
    ・RPC instrumentation library: a C++ library that creates spans, writes logs, and samples
    ・Figure 5. Dapper collection pipeline
    ・[Sigelman 2010]: B. H. Sigelman, et al., Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, Technical report, Google, 2010.
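  24. Facebook Canopy [Kaldor 2017]

    ・An extension of Dapper that makes it possible to join heterogeneous data
    ・End-to-end (e2e) tracing that covers browsers, mobile apps, and backends
    ・Challenge 1: e2e performance data is heterogeneous in execution model, granularity, and quality
    ・Challenge 2: the volume of trace data is enormous
    ・Builds a pipeline for extracting performance data from the whole stack
    ・Each stage of the pipeline can be customized
    ・Figure 2.
    ・[Kaldor 2017]: J. Kaldor, et al., Canopy: An end-to-end performance tracing and analysis system, ACM SOSP, 2017.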
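  25. Connection-based approach [Clawson 15]

    ・Uses the Linux packet filter (iptables) to detect Layer 4 connections
    ・iptables collects connection-start logs, and a log collector stores them in a central database
    ・Dependencies can be traced as long as the L4 communication mechanism of the Linux kernel is used
    ・Challenge 1: it intervenes in per-packet processing inside the Linux kernel, which adds latency overhead to process communication
    ・Challenge 2: a connection that has already been made persistent cannot be detected partway through
    ・Challenge 3: tracing is possible per host but not per process
    ・[Clawson 15] J. K. Clawson, Service Dependency Analysis via TCP/UDP Port Tracing, Master's thesis, Brigham Young University-Provo, 2015.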
  26. Packet-based approach

    ・Traces dependencies based on Layer 3 packets
    ・Collects packets from existing traffic and analyzes information such as the source and destination hosts and ports in the packet headers and the times at which packets were sent and received
    ・I have not yet seen a use case in a real cloud environment
    ・Details are omitted
    ・P. Bahl, et al., Towards Highly Reliable Enterprise Network Services via Inference of Multi-Level Dependencies, ACM SIGCOMM Computer Communication Review, Vol. 37, No. 4, pp. 13–24, 2007.
    ・X. Chen, et al., Automating Network Application Dependency Discovery: Experiences, Limitations, and New Solutions, USENIX Symposium on OSDI, pp. 117–130, 2008.
    ・L. Popa, et al., Macroscope: End-Point Approach to Networked Application Dependency Discovery, CoNEXT, pp. 229–240, 2009.
    ・A. Natarajan, et al., NSDMiner: Automated Discovery of Network Service Dependencies, IEEE INFOCOM, pp. 2507–2515, 2012.
    ・A. Zand, et al., Rippler: Delay Injection for Service Dependency Detection, IEEE INFOCOM, pp. 2157–2165, 2014.
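  27. Summary of the challenges of each approach

    ・Common to both: latency overhead, the overhead of inserting additional processing into the communication path
    ・Request-based approach, instrumentation cost: tracing code has to be added inside existing components
    ・Request-based approach, false negatives: tracing code is added by hand, so coverage is incomplete, and the instrumentation cost tends to limit adoption to part of the system
    ・Connection-based approach, false negatives: a connection that has already been made persistent cannot be detected partway through
    ・Connection-based approach: the unit of tracing is the IP address, not the process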
  28. 3. Design and implementation of Transtracer

  29. Design

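  30. Transtracer requirements and solutions

    1. Lower instrumentation cost: adopt the connection-based approach
    2. Dependency tracing per process: obtain connection information from sockets, the endpoints of connections
    3. Fewer false negatives in environments with persistent connections: monitor the connection information held by sockets by polling
    4. Low latency overhead: socket monitoring is independent of the processing done by the processes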
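  31. Transtracer system architecture

    ・A Tracer agent is placed on each host or Pod
    ・Each agent stores the connection information it obtains in the CMDB (Connection Management DataBase)
    ・The system administrator accesses the CMDB and obtains dependencies that span multiple hosts and Pods
    ・Diagram: Host 1, Host 2, ..., Host N each run a Tracer that sends data to the CMDB, which the systems administrator queries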
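  32. Collecting TCP connection information

    ・The Tracer process queries the Linux kernel and obtains TCP/UDP socket information by polling
    ・It also obtains information about the OS processes that terminate the connections
    ・Socket information: /proc/net/tcp or Netlink sock_diag
    ・Process information: /proc/<pid>/{stat,fd}
    ・Overhead is low because the Tracer does not intervene in the processes' own processing
    ・Diagram: on a host, processes hold TCP/UDP connections through the kernel, and the Tracer polls the kernel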
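As an illustration of the socket information mentioned above, the following sketch parses the text format of /proc/net/tcp. It is not taken from Transtracer. The function names and the `SAMPLE` text are hypothetical, only two TCP states are mapped, and the address decoding assumes a little-endian host such as x86.

```python
import socket
import struct

TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}  # subset of kernel states

def parse_addr(hex_addr):
    """Decode 'HHHHHHHH:PPPP' from /proc/net/tcp into (ip, port)."""
    ip_hex, port_hex = hex_addr.split(":")
    # The IPv4 address is a hex dump of a 32-bit value in host byte order
    # (assumed little-endian here), so repack it before formatting.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)

def parse_proc_net_tcp(text):
    """Parse the contents of /proc/net/tcp into a list of dicts."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        entries.append({
            "local": parse_addr(fields[1]),
            "remote": parse_addr(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
            "inode": int(fields[9]),
        })
    return entries

# Hypothetical sample: one LISTEN socket and one established connection.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0A00000A:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 20001 1
   1: 0A00000A:0050 0B00000A:C350 01 00000000:00000000 00:00000000 00000000    33        0 20002 1
"""

for entry in parse_proc_net_tcp(SAMPLE):
    print(entry)
```

On a Linux host the same function can be fed `open("/proc/net/tcp").read()`. The inode column is what links a socket to a process: the entries under /proc/<pid>/fd are symlinks of the form `socket:[inode]`.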
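  33. Determining the direction of a TCP connection dependency

    ・Host Y, which requests the connection, depends on host X, which accepts it
    ・If the destination port seen from host Y is the LISTEN port M, then host Y is the one requesting the connection
    ・The LISTEN ports are obtained by querying the OS of host X
    ・Diagram: process B on host Y (port N, CONNECT) connects to process A on host X (port M, LISTEN)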
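The direction rule above can be written as a small function. This is a hedged sketch of the rule as described on the slide, not the actual Transtracer implementation. The name `classify_connection` and the tuple layout are assumptions made for illustration.

```python
def classify_connection(local, remote, listen_ports):
    """Decide the dependency direction of one established connection.

    local, remote: (ip, port) tuples as seen on this host.
    listen_ports: set of ports this host is listening on.
    Returns (active, passive), where active depends on passive.
    """
    if local[1] in listen_ports:
        # The local port is a LISTEN port, so the peer initiated the connection.
        return remote, local
    # Otherwise this host initiated the connection to the peer.
    return local, remote

# On host X (10.0.0.10), which listens on port 80:
print(classify_connection(("10.0.0.10", 80), ("10.0.0.11", 50000), {80}))
print(classify_connection(("10.0.0.10", 40000), ("10.0.0.12", 8080), {80}))
```

The sketch only needs the local host's own LISTEN ports, which matches the slide's point that they are obtained from the local OS rather than from the peer.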
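  34. Aggregating ephemeral ports

    ・Collecting every piece of connection information would make the data stored in the CMDB large, so redundant information is reduced
    ・Ephemeral port: a random source port assigned by the kernel
    ・A given LISTEN port receives connections from many ephemeral ports
    ・These flows are aggregated and treated as a single aggregated flow
    ・Diagram: several ephemeral ports of a process on one host connect to the LISTEN port of a process on another host and are aggregated into one flow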
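The aggregation of ephemeral ports can be sketched as a grouping step that drops the active-side port from the key. The function `aggregate_flows` and the sample addresses below are hypothetical, and the real agent may aggregate per process rather than per IP address.

```python
from collections import Counter

def aggregate_flows(connections):
    """Collapse connections that differ only in the active-side port.

    connections: iterable of ((active_ip, active_port), (passive_ip, passive_port)).
    Returns a Counter keyed by (active_ip, passive_ip, passive_port).
    """
    return Counter(
        (active[0], passive[0], passive[1]) for active, passive in connections
    )

# Three connections from ephemeral ports to the same LISTEN port, plus one other flow.
connections = [
    (("10.0.0.11", 50000), ("10.0.0.10", 80)),
    (("10.0.0.11", 50001), ("10.0.0.10", 80)),
    (("10.0.0.11", 50002), ("10.0.0.10", 80)),
    (("10.0.0.10", 40000), ("10.0.0.12", 8080)),
]
for key, count in aggregate_flows(connections).items():
    print(key, count)
```

Keeping a count per aggregated flow is an addition made for this example. The slide only states that the flows are treated as one.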
  35. Implementation: https://github.com/yuuki/transtracer

  36. Software components

    ・CMDB: PostgreSQL 11.3
    ・ttracerd: the Tracer agent
    ・ttctl: a CLI that queries the CMDB
    ・Diagram: ttracerd on each host/Pod sends trace data to the CMDB (PostgreSQL), and ttctl queries the CMDB
  37. Transtracer usage example

        $ ttctl --dbhost 10.0.0.20 --ipv4 10.0.0.10
        10.0.0.10:80 ('nginx', pgid=4656)
        └<-- 10.0.0.11:many ('wrk', pgid=5982)
        10.0.0.10:80 ('nginx', pgid=4656)
        └--> 10.0.0.12:8080 ('python', pgid=6111)
        10.0.0.10:many ('fluentd', pgid=2127)
        └--> 10.0.0.13:24224 ('fluentd', pgid=2001)

    Diagram: wrk on 10.0.0.11 connects to nginx on 10.0.0.10 (:80), nginx connects to python on 10.0.0.12 (:8080), and fluentd on 10.0.0.10 connects to fluentd on 10.0.0.13 (:24224).
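  38. Data structure for processes

    ① The Linux process group is the smallest unit of a node

        $ pstree -apg | grep nginx
        ├─nginx,627,627
        │ ├─nginx,628,627
        │ └─nginx,629,627

    ② A uniqueness constraint is placed on processes by (machine-id, ipv4, pgid, pname)
    ・The ID changes when a process restarts, so at query time entries that differ only in pgid are deduplicated
    ・machine-id is used to cope with IP address reuse (machine-id is not implemented yet)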
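  39. Data structure for connection management

    ③ Nodes are classified as Active or Passive
    ④ Flows are stored as Active => Passive
    ・The port on the Active side has already been aggregated, so it is not kept (p.33)
    ・Only the Passive side keeps its listen port
    ・The same process can be both Active and Passive
    ・The same process may listen on multiple ports
    ・Diagram: a process acting as a Passive node on ports N and M and as an Active node toward other processes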
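  40. CMDB table schema (excerpt)

    processes
      ・process_id: primary key identifying the process
      ・ipv4: IP address on the host where the process runs
      ・pgid: Linux process group ID
      ・pname: process name
      ・unique constraint: (ipv4, pgid, pname)
    active_nodes
      ・node_id: primary key identifying the node
      ・process_id: foreign key to the processes table
      ・unique constraint: (process_id)
    passive_nodes
      ・node_id: primary key identifying the node
      ・process_id: foreign key to the processes table
      ・port: listen port number
      ・unique constraint: (process_id, port)
    flows
      ・flow_id: primary key identifying a connection between nodes
      ・active_node_id: foreign key to the active_nodes table
      ・passive_node_id: foreign key to the passive_nodes table
      ・unique constraint: (active_node_id, passive_node_id)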
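To make the role of the unique constraints concrete, the schema excerpt can be written out as DDL. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The column types, the `open_cmdb` and `upsert_process` helpers, and the use of `INSERT OR IGNORE` are assumptions, not the actual Transtracer schema or queries.

```python
import sqlite3

SCHEMA = """
CREATE TABLE processes (
    process_id INTEGER PRIMARY KEY,
    ipv4 TEXT NOT NULL,
    pgid INTEGER NOT NULL,
    pname TEXT NOT NULL,
    UNIQUE (ipv4, pgid, pname)
);
CREATE TABLE active_nodes (
    node_id INTEGER PRIMARY KEY,
    process_id INTEGER NOT NULL REFERENCES processes (process_id),
    UNIQUE (process_id)
);
CREATE TABLE passive_nodes (
    node_id INTEGER PRIMARY KEY,
    process_id INTEGER NOT NULL REFERENCES processes (process_id),
    port INTEGER NOT NULL,
    UNIQUE (process_id, port)
);
CREATE TABLE flows (
    flow_id INTEGER PRIMARY KEY,
    active_node_id INTEGER NOT NULL REFERENCES active_nodes (node_id),
    passive_node_id INTEGER NOT NULL REFERENCES passive_nodes (node_id),
    UNIQUE (active_node_id, passive_node_id)
);
"""

def open_cmdb():
    """Create an in-memory database with the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def upsert_process(conn, ipv4, pgid, pname):
    """Insert a process once and return its process_id."""
    conn.execute(
        "INSERT OR IGNORE INTO processes (ipv4, pgid, pname) VALUES (?, ?, ?)",
        (ipv4, pgid, pname),
    )
    row = conn.execute(
        "SELECT process_id FROM processes WHERE ipv4 = ? AND pgid = ? AND pname = ?",
        (ipv4, pgid, pname),
    ).fetchone()
    return row[0]

conn = open_cmdb()
first = upsert_process(conn, "10.0.0.10", 4656, "nginx")
second = upsert_process(conn, "10.0.0.10", 4656, "nginx")  # same row, polled again
print(first == second)
```

Because agents poll repeatedly, the same process and flow are reported many times. The unique constraints are what let repeated reports collapse into a single row.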
  41. 4. Experiments

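  42. Evaluation experiments

    1. Evaluation of the response latency overhead imposed on applications
    2. Evaluation of the CPU utilization overhead of obtaining connection information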

  43. Experimental environment

    ・Client: CPU Intel Xeon CPU E5-2650 v3 2.30GHz, 2 cores; memory 1 GB; benchmarker wrk 4.1.0-4
    ・Server: CPU Intel Xeon CPU E5-2650 v3 2.30GHz, 4 cores; memory 1 GB; HTTP server nginx 1.17.3
    ・CMDB: CPU Intel Xeon CPU E5-2650 v3 2.30GHz, 1 core; memory 1 GB; database PostgreSQL 11.3
    ・All instances were built on SAKURA Cloud
    ・Linux Kernel 4.15 (Ubuntu Server 18.04.3 LTS)
  44. Configuration of each implementation

    ・Normal: no tracing is running
    ・Transtracer: tracing by Transtracer is running
    ・iptables NEW filter method: logs only new connections
      ・-I INPUT -m state --state NEW -m limit -j TRACE-LOG
    ・iptables ESTB filter method: logs every packet exchanged while a connection is established
      ・-I INPUT -m state --state ESTABLISHED -m limit -j TRACE-LOG
  45. Response latency overhead

    Average latency (ms) by number of connections:

        Connections    5000    10000   15000   20000
        Normal         93.1    191.6   279.3   353.8
        Transtracer    94.7    188.3   291.8   401.2
        ESTB filter    115.0   236.0   359.0   462.5
        NEW filter     113.1   214.4   310.0   449.3

    ・Compared with Normal, Transtracer adds 1.7 to 13.4% overhead
    ・Compared with the ESTB filter method of the iptables implementation, Transtracer reduces overhead by 13 to 20%
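The percentages on this slide can be recomputed from the latency table. The sketch below assumes that the four values in each row correspond to 5,000, 10,000, 15,000, and 20,000 connections, as the axis labels suggest. The helper `relative_change` is introduced only for this example.

```python
connections = [5000, 10000, 15000, 20000]
latency_ms = {
    "Normal": [93.1, 191.6, 279.3, 353.8],
    "Transtracer": [94.7, 188.3, 291.8, 401.2],
    "ESTB filter": [115.0, 236.0, 359.0, 462.5],
}

def relative_change(new, base):
    """Percentage change of new relative to base."""
    return (new - base) / base * 100.0

for i, n in enumerate(connections):
    vs_normal = relative_change(latency_ms["Transtracer"][i], latency_ms["Normal"][i])
    vs_estb = relative_change(latency_ms["Transtracer"][i], latency_ms["ESTB filter"][i])
    print(f"{n:>6} connections: {vs_normal:+.1f}% vs Normal, {vs_estb:+.1f}% vs ESTB filter")
```

Under that assumption the extremes match the slide: +1.7% at 5,000 connections and +13.4% at 20,000 relative to Normal. The 10,000-connection measurement comes out slightly below Normal, which is presumably measurement noise.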
  46. CPU utilization overhead

    CPU usage (%) and socket reading time (ms) by number of connections:

        Connections                   5000    10000   15000   20000
        ttracerd's CPU usage (%)      13.2    23.0    34.2    44.4
        ESTB filter's CPU usage (%)   72.2    75.9    78.8    78.6
        Reading sockets time (ms)     102.3   199.1   317.8   408.6

    ・At 20,000 connections, Transtracer's CPU utilization is 44.4% and the ESTB filter method's is 78.6%
    ・This is a 43.5% reduction in CPU utilization
    ・The NEW filter method has the advantage that it uses CPU only when a connection is established
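  47. Relationship between polling interval and CPU utilization

    ・How much is CPU utilization reduced when the polling interval is increased at 20,000 connections?
    ・At the cost of possibly missing short-lived connections that last less than 5 seconds, CPU utilization can be reduced to 8.6%
    ・In environments that keep connections persistent, as with HTTP/2, missing short-lived connections is not a problem

        Polling interval   1      2      3      4      5
        CPU usage (%)      44.4   21.6   13.0   10.8   8.6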
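Why a longer polling interval can miss short-lived connections can be shown with an idealized model. The model assumes polls happen exactly at t = 0, interval, 2 × interval, and so on, and that a connection is visible only while it is open. The function `detected_by_polling` is hypothetical and ignores details such as sockets lingering in TIME_WAIT.

```python
import math

def detected_by_polling(start, end, interval):
    """True if a poll at t = 0, interval, 2*interval, ... falls inside [start, end]."""
    first_poll = math.ceil(start / interval) * interval  # first poll at or after start
    return first_poll <= end

# A connection open from t=1.0 to t=4.0 seconds:
print(detected_by_polling(1.0, 4.0, 1))  # polled every second
print(detected_by_polling(1.0, 4.0, 5))  # polled every five seconds
```

In this model a connection is always detected when it lives at least as long as the polling interval. Shorter connections are detected only if a poll happens to fall inside their lifetime.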
  48. 5. Future work

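  49. Kubernetes support

    ・Place the Tracer process as a sidecar of each Pod
    ・Handle NAT for Pod-to-Pod connections and connections leaving the cluster when Service resources are used
    ・Scan the Linux conntrack table to map the connections
    ・Diagram: a Pod containing a Tracer container connects through NAT to the Endpoint of another Pod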
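  50. Detecting connections by streaming

    ・Combine streaming with polling to detect the short-lived connections that polling misses
    ・Use eBPF to capture connect(2) and accept(2) events and obtain flow information
    ・For UDP, capture sendmsg(2) and recvmsg(2) events
    ・Diagram: on a Linux host, the Tracer receives a stream of events from the kernel about the processes' TCP/UDP flows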
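  51. Preliminary experiment on the load of eBPF streaming

    ・Load experiment with tcpaccept from iovisor/bcc in an environment without persistent connections
    ・Benchmarked nginx with wrk (HTTP KeepAlive off) at 1,000 concurrent connections
    ・CPU utilization was around 45 to 50% per core
    ・No significant degradation in response latency was observed
    ・Copying every eBPF event to userland may be the reason for the high CPU load
    ・Details are in the experiment notes: https://www.notion.so/yuuk1/iovisor-bcc-tcpconnect-tcpaccept-af2d1fdce35c49fb945b548db373213d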
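  52. Reducing the setup cost of the CMDB

    ・Instead of using a CMDB, each agent keeps its own flow data in local storage
    ・A query is sent to any agent, and the requested flow data is gathered from the agents in a distributed way
    ・When answering, each agent follows the dependencies on its own host and queries the agents on neighboring hosts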

  53. 6. Summary

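  54. Summary

    ・Introduced Transtracer, a tool that traces inter-process dependencies in order to predict reliability risks when a system is changed
    ・It can detect inter-process dependencies exhaustively while keeping the overhead on servers and applications low
    ・However, it may miss short-lived connections that last only a few seconds
    ・Future research and development will look at how far overhead can be reduced while covering every connection flow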
  55. If you have a problem that Transtracer might be able to solve, I would be glad to receive your feedback. Thanks to id:masayoshi / @yoyogidesaiz for their ideas related to this presentation.