分散システム内の関係性に着目したObservabilityツール / Observability tool focused on relationship in distributed systems

A talk introducing tools developed by Yuuki, such as lstf and transtracer, in the context of recent trends in observability.

Kyoto.なんか #5, https://kyoto-nanka.connpass.com/event/141982/, August 24, 2019.

Yuuki Tsubouchi (yuuk1)

August 24, 2019

Transcript

  1. 3

  2. Distributed systems over the past ten years or so

    Diagram labels: External DNS server, Internal DNS server, Web server, Application server (x2), RDB server, Full text search server, KVS server, Message queue server, Batch server. Legend: Application flow, DNS flow.
  3. What runs on a single host

    Diagram labels: Log collector agent, Main network process, Monitoring agent, Proxy, User Authentication, DNS forwarder.
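  4. Collecting TCP/UDP connection events

    Diagram labels: Host, Kernel, Process, Process, Transport, …, Tracer, Polling.

    • The Tracer process queries the Linux kernel and obtains TCP/UDP socket information by polling.
    • It also obtains information about the OS processes that terminate each connection.
    • Socket information: /proc/net/tcp or Netlink sock_diag
    • Process information: /proc/<pid>/{stat,fd}
    • Low overhead, because it does not intervene in processing.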
  5. lstf

    $ lstf -n
    Local Address:Port   <-->   Peer Address:Port     Connections
    10.0.1.9:many        -->    10.0.1.10:3306        22
    10.0.1.9:many        -->    10.0.1.11:3306        14
    10.0.2.10:22         <--    192.168.10.10:many    1
    10.0.1.9:80          <--    10.0.2.13:many        120
    10.0.1.9:80          <--    10.0.2.14:many        202
  6. You can also see which process each connection belongs to

    $ lstf -n --process
    Local Address:Port   <-->   Peer Address:Port     Connections   Process
    10.0.1.9:many        -->    10.0.1.10:3306        22            {"mysqld", pgid=6342}
    10.0.1.9:many        -->    10.0.1.11:3306        14            {"mysqld", pgid=9398}
    10.0.2.10:22         <--    192.168.10.10:many    1             {"sshd", pgid=27027}
    10.0.1.9:80          <--    10.0.2.13:many        120           {"unicorn", pgid=3790}
    10.0.1.9:80          <--    10.0.2.14:many        202           {"unicorn", pgid=3790}
  7. 1. How are TCP connection events obtained?

    • From /proc/net/tcp
        • procfs exposes kernel information through a file-system interface
        • Uses https://github.com/shirou/gopsutil
    • From the Netlink API
        • Netlink exposes kernel information through a socket interface; fast
        • Socket Monitoring Interface
        • Uses github.com/elastic/gosigar/sys/linux
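To make the procfs route concrete, here is a minimal Python sketch of parsing `/proc/net/tcp`-style text. It is not the code of lstf or gopsutil. The names `TcpEntry`, `parse_proc_net_tcp`, and `SAMPLE` are hypothetical, the sample rows are made up, and the address decoding assumes IPv4 on a little-endian host such as x86-64.

```python
import socket
import struct
from typing import NamedTuple


class TcpEntry(NamedTuple):
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int
    state: str
    inode: int


# Only the two states used in this sketch; other codes are passed through as hex.
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}

# Made-up sample in the layout of /proc/net/tcp (header line plus two sockets).
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000   106        0 15689 1 0000000000000000 100 0 0 10 0
   1: 0901000A:D431 0A01000A:0CEA 01 00000000:00000000 00:00000000 00000000  1000        0 20412 1 0000000000000000 20 4 30 10 -1
"""


def _decode_addr(field: str) -> tuple[str, int]:
    """Decode 'HEXIP:HEXPORT'. Assumes the IPv4 word was written by a little-endian host."""
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_proc_net_tcp(text: str) -> list[TcpEntry]:
    """Parse the text of /proc/net/tcp into one TcpEntry per socket."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header line
        cols = line.split()
        if len(cols) < 10:
            continue
        local_ip, local_port = _decode_addr(cols[1])
        peer_ip, peer_port = _decode_addr(cols[2])
        state = TCP_STATES.get(cols[3], cols[3])
        entries.append(
            TcpEntry(local_ip, local_port, peer_ip, peer_port, state, int(cols[9]))
        )
    return entries


ENTRIES = parse_proc_net_tcp(SAMPLE)
```

On a Linux host the same function could be fed the real file, for example `parse_proc_net_tcp(open("/proc/net/tcp").read())`. The inode column is kept because it is what later links a socket to a process.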
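  8. Procfs vs Netlink

    • Compared the execution time of the lstf command, excluding name-resolution time, on a web server with about 40,000 connections
    • EC2 c4.2xlarge, Debian 8.10, Linux kernel 3.16
    • 500 ms (procfs) => 300 ms (netlink)
    • The Netlink implementation is about 1.6 times faster

    https://memo.yuuk.io/entry/2018/06/18/003157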
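  9. 2. How is the direction of a TCP connection identified?

    Diagram: Host Y (Port N, Process B) CONNECT -> Host X (Port M, Process A) LISTEN

    • Host Y, which requests the connection, depends on host X, which accepts it.
    • If the destination port seen from host Y is the LISTEN port M, we know that host Y requested the connection.
    • The LISTEN ports are obtained by querying the OS of host X.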
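The rule above can be sketched as a small classifier. This is an illustrative Python sketch under one assumption: the rule is applied on a single host using that host's own set of LISTEN ports, so a connection whose local port is a LISTEN port counts as incoming and any other connection counts as outgoing. The names `aggregate_flows`, `SAMPLE_CONNS`, and `FLOWS` are hypothetical, and collapsing ephemeral ports into `many` simply mirrors the lstf output format shown earlier.

```python
from collections import Counter
from typing import Iterable

# (local_ip, local_port, peer_ip, peer_port) of established connections on one host
Conn = tuple[str, int, str, int]


def aggregate_flows(conns: Iterable[Conn], listen_ports: set[int]) -> Counter:
    """Count connections per (local, direction, peer), collapsing ephemeral ports."""
    flows: Counter = Counter()
    for local_ip, local_port, peer_ip, peer_port in conns:
        if local_port in listen_ports:
            # Our LISTEN port is the destination: the peer requested the connection.
            key = (f"{local_ip}:{local_port}", "<--", f"{peer_ip}:many")
        else:
            # Our side used an ephemeral port: we requested the connection.
            key = (f"{local_ip}:many", "-->", f"{peer_ip}:{peer_port}")
        flows[key] += 1
    return flows


# Made-up sample connections, using addresses from the lstf output in this talk.
SAMPLE_CONNS = [
    ("10.0.1.9", 54321, "10.0.1.10", 3306),
    ("10.0.1.9", 54322, "10.0.1.10", 3306),
    ("10.0.1.9", 80, "10.0.2.13", 40001),
]

FLOWS = aggregate_flows(SAMPLE_CONNS, listen_ports={80})

for (local, arrow, peer), count in FLOWS.items():
    print(f"{local:<16} {arrow} {peer:<20} {count}")
```

Grouping by the non-ephemeral side keeps the number of rows small even when a host holds tens of thousands of connections.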
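  10. 4. Associating TCP connection events with processes

    • Connection events from procfs or netlink carry no process information.
    • Instead, each connection event includes the inode of its socket.
    • Socket inodes can also be read from under /proc/<pid>/fd.
    • Join the process list and the connection event list, using the inode as the key.
        • Nested loop join

    https://memo.yuuk.io/entry/2019/linux-process-and-connection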
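The inode join described above might look like the following Python sketch. It is not lstf's implementation: the function names and the in-memory sample data are hypothetical, and the only external fact relied on is that on Linux the entries under `/proc/<pid>/fd` are symlinks whose target reads `socket:[<inode>]` for sockets.

```python
import os
import re

SOCKET_LINK = re.compile(r"socket:\[(\d+)\]")


def socket_inodes_of(pid: int, proc_root: str = "/proc") -> set[int]:
    """Collect the socket inodes held by one process via its fd symlinks."""
    inodes: set[int] = set()
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    try:
        names = os.listdir(fd_dir)
    except OSError:  # process exited, or permission denied
        return inodes
    for name in names:
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:  # fd was closed in the meantime
            continue
        match = SOCKET_LINK.fullmatch(target)
        if match:
            inodes.add(int(match.group(1)))
    return inodes


def join_by_inode(processes, connections):
    """Nested loop join of (pid, name, inodes) rows with (label, inode) rows."""
    joined = []
    for label, inode in connections:          # outer loop: connection events
        for pid, name, inodes in processes:   # inner loop: process list
            if inode in inodes:
                joined.append((label, pid, name))
                break
    return joined


# Made-up sample data standing in for real procfs and connection listings.
PROCESSES = [(3790, "unicorn", {20412, 20413}), (27027, "sshd", {15689})]
CONNECTIONS = [("10.0.1.9:80 <-- 10.0.2.13:many", 20412), ("10.0.2.10:22 <-- 192.168.10.10:many", 15689)]
JOINED = join_by_inode(PROCESSES, CONNECTIONS)
```

On a real host, `PROCESSES` could be built by calling `socket_inodes_of` for each numeric directory under `/proc`. Reading another user's fd directory usually needs elevated privileges, which is why the sketch tolerates `OSError`.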
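  11. System architecture

    Diagram labels: Host 1, Host 2, Host N (each running a Tracer), Postgres, Systems Administrator.

    • A Tracer agent is deployed on each host.
    • Each Tracer agent stores the connection information it collects in PostgreSQL.
    • The system administrator queries Postgres to obtain dependencies that span multiple hosts.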
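As a rough illustration of how a central store lets one query follow dependencies across hosts, here is a Python sketch. Everything in it is an assumption: the `flows` table and its columns are a hypothetical schema, not the one the Tracer agents use, and the in-memory `sqlite3` database is only a stand-in for PostgreSQL so the example runs without a server. The `WITH RECURSIVE` query form is also available in PostgreSQL.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the central database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE flows ("
        " source_host TEXT, dest_host TEXT, dest_port INTEGER, connections INTEGER)"
    )
    return db


def record_flows(db: sqlite3.Connection, rows) -> None:
    """What each agent would do after a polling cycle: store its observed flows."""
    db.executemany("INSERT INTO flows VALUES (?, ?, ?, ?)", rows)


def downstream_of(db: sqlite3.Connection, host: str) -> list[str]:
    """Hosts that `host` depends on, directly or through other hosts."""
    query = """
        WITH RECURSIVE deps(host) AS (
            SELECT dest_host FROM flows WHERE source_host = ?
            UNION
            SELECT f.dest_host FROM flows AS f JOIN deps AS d ON f.source_host = d.host
        )
        SELECT host FROM deps ORDER BY host
    """
    return [row[0] for row in db.execute(query, (host,))]


DB = open_store()
record_flows(DB, [
    ("10.0.2.13", "10.0.1.9", 80, 120),    # made-up rows based on the lstf output
    ("10.0.1.9", "10.0.1.10", 3306, 22),
    ("10.0.1.9", "10.0.1.11", 3306, 14),
])
DEPS = downstream_of(DB, "10.0.2.13")
```

Using `UNION` rather than `UNION ALL` removes duplicate hosts, which also lets the recursion terminate when the dependency graph contains a cycle.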
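  12. Limitations

    • Because the information comes from the TCP/UDP layer, information about L7 protocols such as HTTP is not visible.
        • For example, request paths and per-request response times
        • These require combining with other tools.
    • Dependencies cannot be tracked across intermediaries such as forward proxies or NAT.
        • NAT: the endpoints on either side of the NAT cannot identify each other.
        • Proxy: there appear to be more dependencies than actually exist.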
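  13. Summary

    • Problem: dependencies in distributed systems grow complex, which slows the pace of change.
    • Goal: ensure observability focused on relationships.
    • Challenge: there is not yet a tool that tracks dependencies for communication other than between microservices.
    • Solution: lightweight tracing using connection events from the Linux TCP/UDP layer.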