Observability tool focused on relationships in distributed systems

A talk introducing tools developed by Yuuki, such as lstf and transtracer, in the context of recent trends in observability.

Kyoto.なんか #5, https://kyoto-nanka.connpass.com/event/141982/, August 24, 2019.

Yuuki Tsubouchi (yuuk1)

August 24, 2019

Transcript

  1. Observability tool focused on relationships in distributed systems
    id:y_uuki / @yuuk1t
    Kyoto.なんか #5, 2019.08.24

  2. Self-introduction
    https://yuuk.io/
    @yuuk1t
    id:y_uuki
    Former Hatena intern (2011)
    Former Hatena engineer
    SAKURA internet Research Center

  3. LINE OpenChat: Yuuki Lab (ゆううきラボ)

  4. Are you working with distributed systems?

  5. Distributed systems in web services

  6. Distributed systems of roughly the past 10 years
    [Diagram: a web server, application servers, an RDB server, a KVS server, a full-text search server, a message queue server, a batch server, and internal and external DNS servers, connected by application flows and DNS flows]

  7. What it looks like on a single host
    [Diagram: main network process, proxy, log collector agent, monitoring agent, DNS forwarder, user authentication]

  8. An architecture in which multiple systems are coupled
    Microservices are one way of coupling them
    Message Queue
    Reverse Proxy

  9. Dependencies in distributed systems are unknown
    • In distributed systems, knowing who depends on whom tends to rely on human memory
    • Systems have become too complex to memorize or to document fully

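  10. Problems caused by unknown dependencies
    • When changing a component, the scope of the change's impact is unclear
      • You either spend time investigating or give up altogether
    • When an incident occurs, correlations and causal relationships are unclear
      • You painstakingly cross-check various metrics and logs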

  11. Observability focused on the relationships between the elements of a distributed system

  12. Observability
    • Translated into Japanese as 可観測性 or 観測可能性
    • The ability to learn, from the outside, how a system is behaving in production
    • Today, observability is achieved by using logs, traces, and metrics as data sources
    Logging, Tracing, Metrics

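  13. Observability focused on relationships
    • Distributed tracing can visualize the relationships between microservices, the order of requests, and the response time along each path
    • Have applications emit communication logs and collect them
    • In a service mesh, sidecar proxies can capture the logs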

  14. What about relationships other than those between HTTP-based microservices?
    • Supporting the many non-HTTP protocols with a proxy is hard
    • Embedding logging code into middleware that you do not develop yourself is hard

  15. Designing general-purpose, lightweight tracing

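  16. Lightweight Traceability
    General-purpose tracing at the TCP/UDP layer of the Linux kernel
    • Only the events raised when a TCP/UDP connection is established need to be tracked
    • Each event carries a source IP address and port and a destination IP address and port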

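  17. Collecting TCP/UDP connection events
    [Diagram: within a host, processes communicate through the kernel's transport layer, and a Tracer polls the kernel]
    • The Tracer process queries the Linux kernel and obtains TCP/UDP socket information by polling
    • It also obtains information on the OS processes that terminate each connection
    • Socket information: /proc/net/tcp or Netlink sock_diag
    • Process information: /proc//{stat,fd}
    Low overhead, because it does not intervene in the processing itself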

  18. Before tracing continuously, I first built a CLI tool that displays the information at a single point in time

  19. github.com/yuuki/lstf

  20. lstf
    $ lstf -n
    Local Address:Port <--> Peer Address:Port Connections
    10.0.1.9:many --> 10.0.1.10:3306 22
    10.0.1.9:many --> 10.0.1.11:3306 14
    10.0.2.10:22 <-- 192.168.10.10:many 1
    10.0.1.9:80 <-- 10.0.2.13:many 120
    10.0.1.9:80 <-- 10.0.2.14:many 202

  21. It also shows which process is communicating
    $ lstf -n --process
    Local Address:Port <--> Peer Address:Port Connections Process
    10.0.1.9:many --> 10.0.1.10:3306 22 {"mysqld",pgid=6342}
    10.0.1.9:many --> 10.0.1.11:3306 14 {"mysqld",pgid=9398}
    10.0.2.10:22 <-- 192.168.10.10:many 1 {"sshd", pgid=27027}
    10.0.1.9:80 <-- 10.0.2.13:many 120 {"unicorn", pgid=3790}
    10.0.1.9:80 <-- 10.0.2.14:many 202 {"unicorn", pgid=3790}

  22. Demo
    ISUCON4 qualifier
    [Diagram: benchmarker, web, app, db]

  23. Implementation

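  24. Key implementation points
    1. How are TCP connection events obtained?
    2. How is the direction of a TCP connection recognized?
    3. What does aggregating TCP connection events mean?
    4. How are TCP connection events associated with processes?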

  25. 1. How are TCP connection events obtained?
    • From /proc/net/tcp
      • Retrieves information from the kernel through procfs, a file-system-style interface
      • Uses https://github.com/shirou/gopsutil
    • From the Netlink API
      • Retrieves information from the kernel through a socket-style interface; fast
      • Socket Monitoring Interface
      • Uses github.com/elastic/gosigar/sys/linux
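
To make the procfs route concrete, here is a minimal sketch of decoding one row of /proc/net/tcp by hand. It is not how lstf or gopsutil do it internally. The `Conn` type, `parseAddr`, and `parseLine` are hypothetical names, and the sketch assumes IPv4 on a little-endian host, where the kernel prints the address bytes in reverse order.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// Conn holds the fields of one /proc/net/tcp row that matter for
// dependency tracing. The type and its field names are hypothetical.
type Conn struct {
	LocalIP, RemoteIP     net.IP
	LocalPort, RemotePort uint16
	State                 uint8  // 0x01 = ESTABLISHED, 0x0A = LISTEN
	Inode                 uint64 // socket inode
}

// parseAddr decodes an "ADDR:PORT" column such as "0100007F:0CEA".
// Assumption: IPv4 on a little-endian host, so the address bytes are reversed.
func parseAddr(s string) (net.IP, uint16, error) {
	parts := strings.Split(s, ":")
	if len(parts) != 2 {
		return nil, 0, fmt.Errorf("malformed address %q", s)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil || len(raw) != 4 {
		return nil, 0, fmt.Errorf("malformed IPv4 address %q", parts[0])
	}
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return nil, 0, err
	}
	return net.IPv4(raw[3], raw[2], raw[1], raw[0]), uint16(port), nil
}

// parseLine decodes one data row (not the header row) of /proc/net/tcp.
// Columns: 1 = local address, 2 = remote address, 3 = state, 9 = inode.
func parseLine(line string) (Conn, error) {
	f := strings.Fields(line)
	if len(f) < 10 {
		return Conn{}, fmt.Errorf("expected at least 10 fields, got %d", len(f))
	}
	var c Conn
	var err error
	if c.LocalIP, c.LocalPort, err = parseAddr(f[1]); err != nil {
		return Conn{}, err
	}
	if c.RemoteIP, c.RemotePort, err = parseAddr(f[2]); err != nil {
		return Conn{}, err
	}
	st, err := strconv.ParseUint(f[3], 16, 8)
	if err != nil {
		return Conn{}, err
	}
	c.State = uint8(st)
	if c.Inode, err = strconv.ParseUint(f[9], 10, 64); err != nil {
		return Conn{}, err
	}
	return c, nil
}

func main() {
	// A made-up sample row: a socket listening on 127.0.0.1:3306.
	line := "   0: 0100007F:0CEA 00000000:0000 0A 00000000:00000000 00:00000000 00000000   111        0 12345 1 0000000000000000 100 0 0 10 0"
	c, err := parseLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s:%d state=%#02x inode=%d\n", c.LocalIP, c.LocalPort, c.State, c.Inode)
}
```

A real collector would read every row of the file, skip the header, and handle /proc/net/tcp6 as well; the Netlink route returns the same fields as binary messages rather than text.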

  26. Procfs vs Netlink
    • On a web server with about 40,000 connections, compared the execution time of the lstf command, excluding name resolution time
    • EC2 c4.2xlarge, Debian 8.10, Linux kernel 3.16
    • 500 ms (procfs) => 300 ms (netlink)
    • The Netlink implementation is about 1.6 times faster
    https://memo.yuuk.io/entry/2018/06/18/003157

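  27. 2. How is the direction of a TCP connection identified?
    [Diagram: Process B on Host Y (port N) issues CONNECT to Process A on Host X, which LISTENs on port M]
    • Host Y, which requests the connection, depends on Host X, which accepts it
    • If, seen from Host Y, the destination port is the LISTEN port M, then Host Y is the one requesting the connection
    • The LISTEN ports are obtained by querying the OS on Host X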
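
The rule above can be sketched from the point of view of a single host: if a connection's local port is one of that host's LISTEN ports, the host accepted the connection; otherwise it requested it. The `Conn` type and the function names below are hypothetical, and the set of LISTEN ports is assumed to have been collected from the OS beforehand.

```go
package main

import "fmt"

// Conn is a hypothetical, simplified connection record as seen from one host.
type Conn struct {
	LocalAddr string
	LocalPort uint16
	PeerAddr  string
	PeerPort  uint16
}

// isPassive reports whether the local host accepted the connection,
// i.e. whether the local port is one of its LISTEN ports.
func isPassive(c Conn, listenPorts map[uint16]bool) bool {
	return listenPorts[c.LocalPort]
}

// arrow renders the direction in the same notation as the lstf output:
// "<--" when the peer connects to us, "-->" when we connect to the peer.
func arrow(c Conn, listenPorts map[uint16]bool) string {
	if isPassive(c, listenPorts) {
		return "<--"
	}
	return "-->"
}

func main() {
	// Assumed LISTEN ports of the local host (in practice, sockets in LISTEN state).
	listen := map[uint16]bool{80: true, 22: true}

	conns := []Conn{
		{"10.0.1.9", 80, "10.0.2.13", 51234},  // a client reached our port 80
		{"10.0.1.9", 40112, "10.0.1.10", 3306}, // we reached a database
	}
	for _, c := range conns {
		fmt.Printf("%s:%d %s %s:%d\n", c.LocalAddr, c.LocalPort, arrow(c, listen), c.PeerAddr, c.PeerPort)
	}
}
```

The direction doubles as the dependency direction: for "-->" the local host depends on the peer, and for "<--" the peer depends on the local host.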

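  28. 3. What does aggregating TCP connection events mean?
    • Ephemeral port information is unnecessary for understanding dependencies, so it is aggregated away
    • Ephemeral port: a random source port assigned by the kernel
    • A given LISTEN port receives connections from multiple ephemeral ports
    • These connections are aggregated and treated as a single connection
    [Diagram: connections from several ephemeral ports on one host to a single LISTEN port on another host are aggregated into one connection]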
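
As a rough illustration of this aggregation, the sketch below drops the ephemeral port from each connection and counts how many connections collapse into the same flow, which yields rows similar to the `many` lines in the lstf output. The `Conn` and `Flow` types and the `aggregate` function are hypothetical and are not taken from the lstf source.

```go
package main

import (
	"fmt"
	"sort"
)

// Conn is a hypothetical, simplified connection record as seen from one host.
type Conn struct {
	LocalAddr string
	LocalPort uint16
	PeerAddr  string
	PeerPort  uint16
}

// Flow is the aggregated form of a connection: the ephemeral port is dropped
// and only the LISTEN-side port is kept.
type Flow struct {
	LocalAddr string
	PeerAddr  string
	Port      uint16 // LISTEN-side port
	Passive   bool   // true when the local host is the listener
}

// aggregate groups connections that differ only in their ephemeral port
// and returns the number of connections per flow.
func aggregate(conns []Conn, listenPorts map[uint16]bool) map[Flow]int {
	counts := make(map[Flow]int)
	for _, c := range conns {
		f := Flow{LocalAddr: c.LocalAddr, PeerAddr: c.PeerAddr}
		if listenPorts[c.LocalPort] {
			f.Passive, f.Port = true, c.LocalPort
		} else {
			f.Port = c.PeerPort
		}
		counts[f]++
	}
	return counts
}

// format renders a flow in the notation used by the lstf output.
func format(f Flow, n int) string {
	if f.Passive {
		return fmt.Sprintf("%s:%d <-- %s:many %d", f.LocalAddr, f.Port, f.PeerAddr, n)
	}
	return fmt.Sprintf("%s:many --> %s:%d %d", f.LocalAddr, f.PeerAddr, f.Port, n)
}

func main() {
	listen := map[uint16]bool{80: true}
	conns := []Conn{
		{"10.0.1.9", 40112, "10.0.1.10", 3306},
		{"10.0.1.9", 40113, "10.0.1.10", 3306},
		{"10.0.1.9", 80, "10.0.2.13", 51234},
	}
	var rows []string
	for f, n := range aggregate(conns, listen) {
		rows = append(rows, format(f, n))
	}
	sort.Strings(rows) // map iteration order is random, so sort for stable output
	for _, r := range rows {
		fmt.Println(r)
	}
}
```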

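  29. 4. Associating TCP connection events with processes
    • Connection events from procfs or netlink carry no process information
    • Instead, each connection event does include the socket's inode
    • Inode information can be obtained from under /proc//fd
    • Join the process list and the connection event list, using the inode as the key
    • Nested loop join
    https://memo.yuuk.io/entry/2019/linux-process-and-connection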
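
The join can be sketched as follows. On Linux, a file descriptor that refers to a socket appears as a symlink whose target looks like `socket:[12345]`, where the number is the inode. The `Proc` and `Conn` types and both functions are hypothetical, and the inputs are assumed to have been collected from procfs or netlink already, so the sketch itself does not touch /proc.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Proc is a hypothetical process record: its name, process group ID, and
// the socket inodes found among its file descriptors.
type Proc struct {
	Name   string
	PGID   int
	Inodes []uint64
}

// Conn is a hypothetical connection event carrying the socket inode.
type Conn struct {
	Peer  string
	Inode uint64
}

// socketInode extracts the inode from a file descriptor symlink target such
// as "socket:[12345]". It returns false for targets that are not sockets.
func socketInode(target string) (uint64, bool) {
	const prefix, suffix = "socket:[", "]"
	if !strings.HasPrefix(target, prefix) || !strings.HasSuffix(target, suffix) {
		return 0, false
	}
	n, err := strconv.ParseUint(target[len(prefix):len(target)-len(suffix)], 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

// join matches each connection with the process that owns its socket inode,
// using a plain nested loop as described above.
func join(procs []Proc, conns []Conn) map[uint64]Proc {
	owners := make(map[uint64]Proc)
	for _, c := range conns {
		for _, p := range procs {
			for _, ino := range p.Inodes {
				if ino == c.Inode {
					owners[c.Inode] = p
				}
			}
		}
	}
	return owners
}

func main() {
	ino, _ := socketInode("socket:[12345]")
	procs := []Proc{{Name: "unicorn", PGID: 3790, Inodes: []uint64{ino}}}
	conns := []Conn{{Peer: "10.0.2.13", Inode: 12345}, {Peer: "10.0.2.14", Inode: 99999}}
	for inode, p := range join(procs, conns) {
		fmt.Printf("inode %d -> {%q, pgid=%d}\n", inode, p.Name, p.PGID)
	}
}
```

With many connections and processes, building a map from inode to process first would turn the nested loop into a hash join; that is a possible variation, not something the slides state.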

  30. skb->sk->socket->file->f_owner->pid

  31. Relationship-based ping monitoring
    [Diagram: ICMP/TCP/HTTP checks, "current" versus "desired"]
    We want to monitor the actual communication paths

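  32. Identifying communication peers dynamically with lstf
    • A mechanism like lstf can obtain communication peers dynamically
    • Combine it with deadman when a partial network failure occurs
      • github.com/upa/deadman is a TUI monitoring tool based on ping
      • deadman's monitoring targets are configured statically, so generate them dynamically with lstf
    • Embed it in a monitoring agent for continuous monitoring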

  33. Transtracer (WIP)
    github.com/yuuki/transtracer

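  34. System architecture
    [Diagram: Tracers on Host 1, Host 2, ..., Host N write to Postgres, which the systems administrator queries]
    • A Tracer agent is deployed on each host
    • Each Tracer agent stores the connection information it collects in PostgreSQL
    • The system administrator accesses Postgres to retrieve dependencies that span multiple hosts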

  35. Limitations of the approach

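  36. Limitations
    • Because the information comes from the TCP/UDP layer, information from L7 protocols such as HTTP is not available
      • For example, request paths and per-request response times
      • It has to be combined with other tools
    • Dependencies can no longer be traced across intermediaries such as forward proxies or NAT
      • NAT: the hosts on either side of the NAT cannot identify each other
      • Proxy: there appear to be more dependencies than actually exist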

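  37. Summary
    • Problem: dependencies in distributed systems grow complex, which slows the pace of change
    • Goal: achieve observability focused on relationships
    • Challenge: there is not yet a tool that traces communication dependencies other than those between microservices
    • Solution: lightweight tracing using connection events at the Linux TCP/UDP layer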

  38. Future work
    • Improve accuracy by combining polling with event notification
      • Event notification on connect(2) and accept(2) using eBPF
    • Container support