Slide 1

Slide 1 text

An Observability Tool Focused on Relationships within Distributed Systems
id:y_uuki / @yuuk1t
Kyoto.なんか #5, 2019.08.24

Slide 2

Slide 2 text

Self-introduction
https://yuuk.io/
@yuuk1t
id:y_uuki
• Former Hatena intern (2011)
• Former Hatena engineer
• SAKURA internet Research Center

Slide 3

Slide 3 text


Slide 4

Slide 4 text

LINE OpenChat: ゆううきラボ (Yuuki Lab)

Slide 5

Slide 5 text

Do you work on distributed systems?

Slide 6

Slide 6 text

Distributed systems in web services

Slide 7

Slide 7 text

Distributed systems of roughly the last 10 years
(Diagram labels: External DNS server, Internal DNS server, Web server, Application server, RDB server, KVS server, Full text search server, Message queue server, Batch server. Arrows: Application flow, DNS flow.)

Slide 8

Slide 8 text

What a single host looks like
(Diagram labels: Log collector agent, Main network process, Monitoring agent, Proxy, User Authentication, DNS forwarder)

Slide 9

Slide 9 text

An architecture in which multiple systems are coupled
Microservices are one style of coupling
(Diagram labels: Message Queue, Reverse Proxy)

Slide 10

Slide 10 text

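Dependencies in distributed systems are unknown
• In distributed systems, knowledge of who depends on whom tends to live in people's memories
• Systems have grown too complex to memorize or to document completely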

Slide 11

Slide 11 text

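Why unknown dependencies are a problem
• When changing a component, the scope of the change's impact is unclear
  • Either spend time investigating or give up on the change altogether
• When an incident occurs, correlations and causal relationships are unclear
  • Various metrics and logs have to be cross-referenced by hand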

Slide 12

Slide 12 text

Observability focused on the relationships between elements within a distributed system

Slide 13

Slide 13 text

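Observability
• Rendered in Japanese as 可観測性 or 観測可能性
• The ability to know, from the outside, how a system is behaving in its production environment
• Today, Observability is achieved by using Logs, Traces, and Metrics as data sources
(Diagram labels: Logging, Tracing, Metrics)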

Slide 14

Slide 14 text

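Relationship-focused Observability
• Distributed tracing can visualize the relationships between microservices, the order of requests, and the response time on each path
• Applications are made to emit communication logs, which are then collected
• In a service mesh, the sidecar proxy can capture those logs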

Slide 15

Slide 15 text

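What about relationships other than those between HTTP-based microservices?
• Supporting the many non-HTTP protocols with a proxy is hard work
• Embedding logging code into middleware that you do not develop yourself is hard work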

Slide 16

Slide 16 text

Thinking about general-purpose, lightweight tracing

Slide 17

Slide 17 text

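Lightweight Traceability
Trace generically at the TCP/UDP layer of the Linux kernel
• Only the events raised when a TCP/UDP connection is established need to be tracked
• Each event carries a pair of endpoints: the source IP address and port, and the destination IP address and port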

Slide 18

Slide 18 text

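Collecting TCP/UDP connection events
(Diagram labels: Host, Kernel, Transport, Process, Process, …, Tracer, Polling)
• The Tracer process queries the Linux kernel and obtains TCP/UDP socket information by polling
• It also obtains information about the OS processes that terminate the connections
• Socket information: /proc/net/tcp or Netlink sock_diag
• Process information: /proc//{stat,fd}
• Low overhead, because the Tracer does not intervene in the processing itself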

Slide 19

Slide 19 text

Before tracing continuously, I first built a CLI tool that displays the information at a single point in time

Slide 20

Slide 20 text

github.com/yuuki/lstf

Slide 21

Slide 21 text

lstf
$ lstf -n
Local Address:Port   <-->   Peer Address:Port     Connections
10.0.1.9:many        -->    10.0.1.10:3306        22
10.0.1.9:many        -->    10.0.1.11:3306        14
10.0.2.10:22         <--    192.168.10.10:many    1
10.0.1.9:80          <--    10.0.2.13:many        120
10.0.1.9:80          <--    10.0.2.14:many        202

Slide 22

Slide 22 text

You can also see which process is doing the communicating
$ lstf -n --process
Local Address:Port   <-->   Peer Address:Port     Connections   Process
10.0.1.9:many        -->    10.0.1.10:3306        22            {"mysqld", pgid=6342}
10.0.1.9:many        -->    10.0.1.11:3306        14            {"mysqld", pgid=9398}
10.0.2.10:22         <--    192.168.10.10:many    1             {"sshd", pgid=27027}
10.0.1.9:80          <--    10.0.2.13:many        120           {"unicorn", pgid=3790}
10.0.1.9:80          <--    10.0.2.14:many        202           {"unicorn", pgid=3790}

Slide 23

Slide 23 text

Demo
ISUCON4 qualifier
(Diagram labels: benchmarker, web, app, db)

Slide 24

Slide 24 text

Implementation

Slide 25

Slide 25 text

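Key implementation points
1. How are TCP connection events obtained?
2. How is the direction of a TCP connection identified?
3. What does aggregating TCP connection events mean?
4. How are TCP connection events associated with processes?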

Slide 26

Slide 26 text

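1. How are TCP connection events obtained?
• Obtain them from /proc/net/tcp
  • procfs exposes kernel information through a filesystem interface
  • Uses https://github.com/shirou/gopsutil
• Obtain them from the Netlink API
  • Netlink exposes kernel information through a socket interface; it is fast
  • Socket Monitoring Interface
  • Uses github.com/elastic/gosigar/sys/linux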
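
To make the procfs route concrete, here is a minimal sketch of decoding /proc/net/tcp by hand. lstf itself relies on the Go libraries listed above; the Python below is only an illustration. The names `SocketEntry`, `decode_addr`, `parse_proc_net_tcp`, and the `SAMPLE` text are hypothetical, only IPv4 is handled, and a little-endian machine is assumed.

```python
import socket
import struct
from typing import NamedTuple

# Subset of the hexadecimal TCP state codes printed in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


class SocketEntry(NamedTuple):
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int
    state: str
    inode: int


def decode_addr(hex_addr: str) -> tuple[str, int]:
    """Decode 'HEXIP:HEXPORT' as printed in /proc/net/tcp (IPv4 only)."""
    ip_hex, port_hex = hex_addr.split(":")
    # Assumption: little-endian host (e.g. x86-64), so the printed 32-bit
    # value has to be re-packed little-endian to recover network order.
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text: str) -> list[SocketEntry]:
    """Parse the text of /proc/net/tcp into socket entries."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 10:
            continue
        local_ip, local_port = decode_addr(fields[1])
        peer_ip, peer_port = decode_addr(fields[2])
        state = TCP_STATES.get(fields[3], fields[3])
        inode = int(fields[9])  # socket inode, later used to find the process
        entries.append(
            SocketEntry(local_ip, local_port, peer_ip, peer_port, state, inode)
        )
    return entries


# Hand-written sample in the /proc/net/tcp layout (not captured from a host).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0901000A:C350 0A01000A:0CEA 01 00000000:00000000 00:00000000 00000000  1000        0 67890 1 0000000000000000 20 4 30 10 -1
"""

for entry in parse_proc_net_tcp(SAMPLE):
    print(entry)
```

On a real Linux host the same function could be fed the contents of /proc/net/tcp. The Netlink sock_diag route returns equivalent data in binary form over a socket, which avoids this text parsing.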

Slide 27

Slide 27 text

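Procfs vs Netlink
• Compared the execution time of the lstf command, excluding name resolution time, on a web server with about 40,000 connections
• EC2 c4.2xlarge, Debian 8.10, Linux kernel 3.16
• 500 ms (procfs) => 300 ms (netlink)
• The Netlink implementation is roughly 1.6 times faster
https://memo.yuuk.io/entry/2018/06/18/003157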

Slide 28

Slide 28 text

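2. How is the direction of a TCP connection identified?
(Diagram: Host Y, Port N, Process B, CONNECT -> Host X, Port M, Process A, LISTEN)
• Host Y, which requests the connection, depends on Host X, which accepts it
• If the destination port seen from Host Y is the LISTEN port M, Host Y is the one requesting the connection
• The LISTEN ports are obtained by querying the OS on Host X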
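
The rule on this slide can be sketched as a small classifier. This is an illustration under stated assumptions, not lstf's actual code: `Conn` and `direction` are hypothetical names, and the set of LISTEN ports is assumed to have been collected from the local OS already.

```python
from typing import NamedTuple


class Conn(NamedTuple):
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int


def direction(conn: Conn, listen_ports: set[int]) -> str:
    """Classify a connection as seen from the local host.

    listen_ports holds the ports that sockets on the local host listen on.
    """
    if conn.local_port in listen_ports:
        # The local side is the LISTEN port, so the peer opened the
        # connection: the peer depends on this host.
        return "passive"
    # Otherwise the local port is ephemeral and this host opened the
    # connection: this host depends on the peer.
    return "active"


listen_ports = {22, 80}
print(direction(Conn("10.0.1.9", 80, "10.0.2.13", 51234), listen_ports))
print(direction(Conn("10.0.1.9", 40312, "10.0.1.10", 3306), listen_ports))
```

With the sample data, the first connection arrives at local port 80 and is classified as passive; the second leaves from an ephemeral port toward 3306 and is classified as active.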

Slide 29

Slide 29 text

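3. What does aggregating TCP connection events mean?
• Ephemeral port information is redundant for understanding dependencies, so it is aggregated away
• Ephemeral port: a random source port assigned by the kernel
• A given LISTEN port receives connections from many ephemeral ports
• These connections are aggregated and treated as a single connection
(Diagram: several ephemeral ports on one host and process connect to a single LISTEN port on another host and process; they are aggregated into one connection.)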
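
A minimal sketch of this aggregation: drop the ephemeral port from each connection and count what remains. The key layout and the names `Conn` and `aggregate` are assumptions made for illustration; the counts correspond to the "Connections" column in the lstf output shown earlier.

```python
from collections import Counter
from typing import NamedTuple


class Conn(NamedTuple):
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int


def aggregate(conns: list[Conn], listen_ports: set[int]) -> Counter:
    """Count connections per (direction, local_ip, peer_ip, service_port)."""
    flows: Counter = Counter()
    for c in conns:
        if c.local_port in listen_ports:
            # Incoming: the peer's port is ephemeral, so it is dropped.
            key = ("<--", c.local_ip, c.peer_ip, c.local_port)
        else:
            # Outgoing: the local port is ephemeral, so it is dropped.
            key = ("-->", c.local_ip, c.peer_ip, c.peer_port)
        flows[key] += 1
    return flows


conns = [
    Conn("10.0.1.9", 40312, "10.0.1.10", 3306),
    Conn("10.0.1.9", 40313, "10.0.1.10", 3306),
    Conn("10.0.1.9", 80, "10.0.2.13", 51234),
]
flows = aggregate(conns, listen_ports={80})
for (arrow, local_ip, peer_ip, port), count in flows.items():
    print(arrow, local_ip, peer_ip, port, count)
```

The two outgoing connections to 10.0.1.10:3306 collapse into one flow with a count of 2, which is the information needed for a dependency graph.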

Slide 30

Slide 30 text

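4. Associating TCP connection events with processes
• Connection events from procfs or netlink carry no process information
• They do, however, carry the socket's inode number
• Inode numbers can be read from the entries under /proc//fd
• Join the process list and the connection event list on the inode as the key
  • Nested Loop join
https://memo.yuuk.io/entry/2019/linux-process-and-connection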
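
The join described here can be sketched as follows. On Linux, a file descriptor that refers to a socket appears under a process's fd directory as a symlink whose target has the form `socket:[INODE]`. The sketch takes those link targets as plain strings so that it runs anywhere; `Proc`, `Conn`, `socket_inodes`, and `join_by_inode` are hypothetical names, and the sample values are made up.

```python
import re
from typing import NamedTuple, Optional

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


class Proc(NamedTuple):
    pid: int
    name: str
    fd_links: list[str]  # readlink() results for the process's fd entries


class Conn(NamedTuple):
    peer: str
    inode: int


def socket_inodes(proc: Proc) -> set[int]:
    """Extract socket inode numbers from a process's fd link targets."""
    inodes = set()
    for link in proc.fd_links:
        match = SOCKET_LINK.match(link)
        if match:
            inodes.add(int(match.group(1)))
    return inodes


def join_by_inode(
    procs: list[Proc], conns: list[Conn]
) -> list[tuple[Conn, Optional[Proc]]]:
    """Nested loop join: for each connection, scan every process."""
    joined = []
    for conn in conns:
        owner = None
        for proc in procs:
            if conn.inode in socket_inodes(proc):
                owner = proc
                break
        joined.append((conn, owner))
    return joined


procs = [
    Proc(3790, "unicorn", ["/dev/null", "socket:[67890]", "socket:[67891]"]),
    Proc(27027, "sshd", ["socket:[11111]"]),
]
conns = [Conn("10.0.2.13:many", 67890), Conn("10.0.9.9:many", 99999)]
for conn, owner in join_by_inode(procs, conns):
    print(conn.peer, owner.name if owner else "(unknown)")
```

The nested loop is quadratic in the worst case. Building a dictionary from inode to process first would turn each lookup into a constant-time operation, at the cost of an extra pass over the process list.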

Slide 31

Slide 31 text

skb->sk->socket->file->f_owner->pid

Slide 32

Slide 32 text

Applications

Slide 33

Slide 33 text

Relationship-based ping monitoring
(Diagram, current state: ICMP/TCP/HTTP checks. Desired state: ICMP/TCP/HTTP checks that monitor the actual communication paths.)

Slide 34

Slide 34 text

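Identifying communication peers dynamically with lstf
• A mechanism like lstf can obtain the communication peers dynamically
• Combine it with deadman when a partial network failure occurs
  • github.com/upa/deadman is a TUI monitoring tool based on ping
  • deadman's targets are configured statically, so generate them dynamically with lstf
• Embed it in a monitoring agent for continuous monitoring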
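
As a rough sketch of generating monitoring targets dynamically, the function below turns a list of peer addresses into a target list. The two-column "name address" line layout is an assumption about deadman's configuration file and should be checked against the deadman README; `deadman_config` and the sample peers are hypothetical.

```python
def deadman_config(peer_ips: list[str]) -> str:
    """Render one 'name<TAB>address' line per unique peer.

    Assumption: deadman reads whitespace-separated 'name address' lines.
    The address doubles as the display name in this sketch.
    """
    lines = [f"{ip}\t{ip}" for ip in sorted(set(peer_ips))]
    return "\n".join(lines) + "\n"


# Peers as they might be extracted from the outgoing (-->) rows of lstf.
peers = ["10.0.1.10", "10.0.1.11", "10.0.1.10"]
print(deadman_config(peers), end="")
```

Duplicate peers are collapsed so that each destination is probed once.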

Slide 35

Slide 35 text

Transtracer (WIP)
github.com/yuuki/transtracer

Slide 36

Slide 36 text

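System architecture
(Diagram labels: Host 1, Host 2, Host N, each running a Tracer; Postgres; Systems Administrator)
• A Tracer agent is placed on each host
• Each Tracer agent stores the connection information it obtains in PostgreSQL
• The system administrator accesses Postgres and retrieves dependencies that span multiple hosts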
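
To illustrate what retrieving dependencies across hosts could look like, here is a sketch that stores aggregated flows in a table and walks them with a recursive query. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so that it runs without a server. The `flows` table, its columns, and the function names are hypothetical and are not Transtracer's actual schema.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical flows table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE flows (
               src_ip TEXT NOT NULL,
               dst_ip TEXT NOT NULL,
               dst_port INTEGER NOT NULL,
               connections INTEGER NOT NULL,
               PRIMARY KEY (src_ip, dst_ip, dst_port)
           )"""
    )
    return db


def report(db, src_ip: str, dst_ip: str, dst_port: int, connections: int) -> None:
    """What a Tracer agent might send: one aggregated flow."""
    # INSERT OR REPLACE is SQLite syntax; PostgreSQL would use ON CONFLICT.
    db.execute(
        "INSERT OR REPLACE INTO flows VALUES (?, ?, ?, ?)",
        (src_ip, dst_ip, dst_port, connections),
    )


def transitive_dependencies(db, start_ip: str) -> list[str]:
    """Every host that start_ip depends on, directly or indirectly."""
    rows = db.execute(
        """WITH RECURSIVE deps(ip) AS (
               SELECT dst_ip FROM flows WHERE src_ip = ?
               UNION
               SELECT f.dst_ip FROM flows f JOIN deps d ON f.src_ip = d.ip
           )
           SELECT ip FROM deps ORDER BY ip""",
        (start_ip,),
    ).fetchall()
    return [row[0] for row in rows]


db = open_store()
report(db, "10.0.2.13", "10.0.1.9", 80, 120)   # web -> app
report(db, "10.0.1.9", "10.0.1.10", 3306, 22)  # app -> db
report(db, "10.0.1.9", "10.0.1.11", 3306, 14)  # app -> db
print(transitive_dependencies(db, "10.0.2.13"))
```

`UNION` rather than `UNION ALL` removes duplicates, which also keeps the recursion finite when the dependency graph contains cycles.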

Slide 37

Slide 37 text

Limitations of the approach

Slide 38

Slide 38 text

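Limitations
• The information comes from the TCP/UDP layer, so nothing is known about L7 protocols such as HTTP
  • For example, request paths or per-request response times
  • This has to be covered by combining with another tool
• Dependencies can no longer be traced across intermediaries such as forward proxies or NAT
  • NAT: the hosts on either side of the NAT cannot identify each other
  • Proxy: there appear to be more dependencies than actually exist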

Slide 39

Slide 39 text

Summary

Slide 40

Slide 40 text

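Summary
• Problem: dependencies in distributed systems have grown complex, slowing down the pace of change
• Goal: achieve relationship-focused Observability
• Challenge: there is not yet a tool that traces communication dependencies other than those between microservices
• Solution: lightweight tracing using connection events at the Linux TCP/UDP layer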

Slide 41

Slide 41 text

Future work
• Improve accuracy by combining polling with event notification
  • Event notification on connect(2) and accept(2) via eBPF
• Container support