

Yuji Shimoda
October 24, 2018

Higobashi.aws #7, AWS Container Hands-on Study Session: An Introduction to Containers from a Low-Level Perspective

Presentation slides from HIGOBASHI.AWS ( https://classmethod.connpass.com/event/103495/ ), held on October 25, 2018.


Transcript

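  1. What is Docker?
     The collective name for the container platform and software that Docker Inc. develops as OSS (Docker Enterprise, Desktop, Hub, and so on).
     One of several Linux container engines (others include rkt and runc).
     Virtualization software written in Go that builds on Linux features:
       namespaces (pid, net, ipc, mnt, uts, and others): isolate namespaces
       cgroups (short for control groups): limit resources
       UnionFS: a file system that works by creating layers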
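As a rough illustration of the namespaces listed above, the sketch below reads the namespace identifiers that Linux exposes under /proc/PID/ns. The helper names `ns_id` and `same_ns` are made up for this example; it assumes a Linux host with /proc mounted.

```shell
#!/bin/sh
# ns_id PID TYPE: print the namespace identifier of a process,
# in the form "net:[<inode>]". PID may also be the word "self".
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# same_ns PID_A PID_B TYPE: succeed when both processes are in the same
# namespace of the given type (their identifiers are identical).
same_ns() {
    [ "$(ns_id "$1" "$3")" = "$(ns_id "$2" "$3")" ]
}

# List the namespaces named on the slide for the current shell.
for t in pid net ipc mnt uts; do
    printf '%s\t%s\n' "$t" "$(ns_id self "$t")"
done
```

Comparing a container's process with a host process (for example `same_ns <container-pid> 1 net`, run as root) is one way to see whether the container got its own network namespace.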
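  2. The Linux kernel is common software
     Strictly speaking, Linux is a kernel; in the broader sense, it is an OS.
     It is usually used in the form of a Linux distribution (hereafter "distro"):
       Amazon Linux
       Red Hat Enterprise Linux
       Ubuntu Desktop/Server
       etc.
     The Linux kernel is essentially software that the distros share.
     If the architecture is the same, the same application can run on different distros (provided the ABI is compatible).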
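  3. Boot process of a guest OS (x86)
     The following is only a rough outline.
     After BIOS POST or EFI initialization, the boot loader (GRUB, for example) is started.
     The CPU switches from real mode to protected mode (enabling tasks and protection features).
     The boot loader loads the kernel into physical memory.
     Kernel startup preparation (creating page tables, initializing the kernel stack, and so on).
     Kernel startup (bringing up the CPUs, initializing each piece of hardware, and so on).
     The init process (PID 1) is created.
     Startup scripts run and start the various services.
     Reference: Xv6, a simple Unix-like teaching operating system
     https://pdos.csail.mit.edu/6.828/2012/xv6.html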
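  4. What is a network namespace?
     A feature that virtually partitions the Linux network space.
     A network namespace (hereafter "netns") isolates the system resources listed below, and also isolates the UNIX domain abstract socket namespace:
       network devices
       IPv4/IPv6 protocol stacks
       IP routing tables
       firewall rules
       the /proc/net directory (a symbolic link to /proc/PID/net)
       the /sys/class/net directory
       various files under /proc/sys/net
       port numbers (sockets)
     A physical network device can belong to exactly one netns.
     A virtual network device pair (veth pair) provides an L2 tunnel (a pipe-like abstraction) between netns, which can be used to create a bridge to a physical network device in another netns.
     NETWORK_NAMESPACES(7) Linux Programmer's Manual
     http://man7.org/linux/man-pages/man7/network_namespaces.7.html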
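To make the veth pair idea concrete, here is a minimal sketch that connects two network namespaces with one pair. The names `ns-a`, `ns-b`, `veth-a`, `veth-b` and the 10.0.0.0/24 addresses are hypothetical and not from the slides. By default the script only prints the commands; running it with `DRY_RUN=0` as root would execute them.

```shell
#!/bin/sh
# run: print a command; execute it only when DRY_RUN=0 (needs root).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

veth_demo() {
    run ip netns add ns-a
    run ip netns add ns-b
    # Create both ends of the pair, then move one end into each netns.
    run ip link add veth-a type veth peer name veth-b
    run ip link set veth-a netns ns-a
    run ip link set veth-b netns ns-b
    # Address and bring up each end inside its own namespace.
    run ip netns exec ns-a ip addr add 10.0.0.1/24 dev veth-a
    run ip netns exec ns-b ip addr add 10.0.0.2/24 dev veth-b
    run ip netns exec ns-a ip link set veth-a up
    run ip netns exec ns-b ip link set veth-b up
    # Whatever enters one end leaves the other, so ns-a can reach ns-b.
    run ip netns exec ns-a ping -c 1 10.0.0.2
}

veth_demo
```

Docker's bridge network uses the same building block: one end of a veth pair sits in the container's netns and the other end is attached to the docker0 bridge on the host.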
  5. Checking before and after installing Docker
     Before installing Docker (with bridge-utils installed in advance):
       $ brctl show
       bridge name     bridge id               STP enabled     interfaces
     After installing Docker:
       $ brctl show
       bridge name     bridge id               STP enabled     interfaces
       docker0         8000.024271457a61       no
     After starting a Docker (nginx) container:
       $ docker ps
       CONTAINER ID   IMAGE   COMMAND                  CREATED      STATUS      PORTS                  NAMES
       efa1e165e8e7   nginx   "nginx -g 'daemon of…"   2 days ago   Up 2 days   0.0.0.0:8080->80/tcp   elegant_cori
       $ brctl show
       bridge name     bridge id               STP enabled     interfaces
       docker0         8000.024271457a61       no              veth3fc628b
  6. Checking before and after installing Docker (continued)
     Connectivity check from the Docker host (ECS instance) to the container:
       $ docker network inspect bridge | jq ".[].Containers"
       {
         "efa1e165e8e7fe4dc924d5da4ea6a5c083ba87064e2abb89ba1b71ba351bb40c": {
           "Name": "elegant_cori",
           "EndpointID": "e5c3325faeea2a57fc16f6835add9c99268849de38389264942f0ae36419180f",
           "MacAddress": "02:42:ac:11:00:02",    ★ MAC address assigned to the container's eth0
           "IPv4Address": "172.17.0.2/16",       ★ IP address assigned to the container's eth0
           "IPv6Address": ""
         }
       }
       $ ip add show dev docker0
       3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
           link/ether 02:42:71:45:7a:61 brd ff:ff:ff:ff:ff:ff
           inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0    ★ IP address of the Linux bridge (docker0)
           :
       $ ping -I docker0 -c 1 172.17.0.2    ★ ping the container's eth0, specifying the interface (docker0)
       PING 172.17.0.2 (172.17.0.2) from 172.17.0.1 docker0: 56(84) bytes of data.
       64 bytes from 172.17.0.2: icmp_seq=1 ttl=255 time=0.043 ms
       --- 172.17.0.2 ping statistics ---
       1 packets transmitted, 1 received, 0% packet loss, time 0ms
       rtt min/avg/max/mdev = 0.043/0.043/0.043/0.000 ms
       $ arp -a
       ip-172-31-16-1.ap-northeast-1.compute.internal (172.31.16.1) at 06:e7:bc:91:cc:b2 [ether] on eth0
       ip-172-17-0-2.ap-northeast-1.compute.internal (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0    ★ MAC address of the container's eth0
  7. How does a container communicate with the outside of the Docker host?
     Packets addressed to the container are forwarded according to the iptables rules:
       $ sudo iptables -S
       -P INPUT ACCEPT
       -P FORWARD DROP
       -P OUTPUT ACCEPT
       -N DOCKER
       -N DOCKER-ISOLATION-STAGE-1
       -N DOCKER-ISOLATION-STAGE-2
       -N DOCKER-USER
       -A FORWARD -j DOCKER-USER
       -A FORWARD -j DOCKER-ISOLATION-STAGE-1
       -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
       -A FORWARD -o docker0 -j DOCKER
       -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
       -A FORWARD -i docker0 -o docker0 -j ACCEPT
       -A DOCKER -d 172.17.0.2/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT    ★ packets destined for the container's address are forwarded
       -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
       -A DOCKER-ISOLATION-STAGE-1 -j RETURN
       -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
       -A DOCKER-ISOLATION-STAGE-2 -j RETURN
       -A DOCKER-USER -j RETURN
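The starred rule in the DOCKER chain is the one that lets traffic through to the container. As a small sketch, the hypothetical helper `published_ports` below picks such rules out of `iptables -S` output and prints the destination, protocol, and port. Note that `iptables -S` lists only the filter table; the rewrite of host port 8080 to container port 80 is normally done by a DNAT rule that Docker places in the nat table, which `sudo iptables -t nat -S` would show.

```shell
#!/bin/sh
# published_ports: read `iptables -S` output on stdin and print
# "destination protocol port" for each rule appended to the DOCKER chain
# that names both a destination address and a destination port.
published_ports() {
    awk '
        $1 == "-A" && $2 == "DOCKER" {
            dst = ""; proto = ""; port = ""
            for (i = 3; i < NF; i++) {
                if ($i == "-d")      dst   = $(i + 1)
                if ($i == "-p")      proto = $(i + 1)
                if ($i == "--dport") port  = $(i + 1)
            }
            if (dst != "" && port != "") print dst, proto, port
        }'
}
```

Usage on a Docker host would be `sudo iptables -S | published_ports`; for the output shown on the slide this yields the single line `172.17.0.2/32 tcp 80`.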
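  8. host network (conceptual diagram)
     (Diagram label: Global Namespace)
     No network namespace is created for the container; the container process is simply started inside the global namespace.
     Compared with the bridge network, there is less overhead because no IP forwarding between namespaces is needed.
     However, listen ports conflict, so you cannot start multiple web containers that listen on port 80.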
  9. Starting a container and checking
     The nginx process is in LISTEN state on port 80 of the Docker host:
       $ docker run --rm -d --net=host nginx
       33f02f44760359b8c4a6ac3870ba1c5991977557daf95450988ab2fa41d852c3
       $ docker ps
       CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS         PORTS   NAMES
       33f02f447603   nginx   "nginx -g 'daemon of…"   4 seconds ago   Up 4 seconds           epic_liskov
       $ docker network inspect host | jq ".[].Containers"
       {
         "33f02f44760359b8c4a6ac3870ba1c5991977557daf95450988ab2fa41d852c3": {
           "Name": "epic_liskov",
           "EndpointID": "1d978b7bfa67fcd260ed830407d1371596b1bbac6ead9183a4e2f1e6345abe92",
           "MacAddress": "",
           "IPv4Address": "",
           "IPv6Address": ""
         }
       }
       $ sudo lsof -i:80
       lsof: no pwd entry for UID 101
       COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
       nginx   5157 root    6u  IPv4  58496      0t0  TCP *:http (LISTEN)
       lsof: no pwd entry for UID 101
       nginx   5183  101    6u  IPv4  58496      0t0  TCP *:http (LISTEN)
  10. Task networking (awsvpc network mode)
     amazon-ecs-agent/eni.md at master · aws/amazon-ecs-agent
     https://github.com/aws/amazon-ecs-agent/blob/master/proposals/eni.md
     The only network mode supported on AWS Fargate.
     Technically, it is a combination of a Linux network namespace and an ENI.
     The ECS CNI plugins are developed according to the CNI (Container Network Interface) specification.
     aws/amazon-ecs-cni-plugins: Networking Plugins repository for ECS Task Networking
     https://github.com/aws/amazon-ecs-cni-plugins
  11. In detail: task networking for Amazon ECS
     Reference: "In detail: task networking for Amazon ECS" | Amazon Web Services Blog
     https://aws.amazon.com/jp/blogs/news/under-the-hood-task-networking-for-amazon-ecs/
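  12. Advantages and disadvantages of task networking
     Advantages
       There is no software overhead of the kind the bridge network has.
       Container processes listen directly on the ENI assigned to the namespace, so it is equivalent to the host network.
       Each task (container) gets a security group configured with only the minimum ports its service needs, which keeps management simple.
     Disadvantages
       The number of containers that can be started depends on the ECS instance type (as an EC2 limit, the maximum number of ENIs that can be configured differs per instance type).
     Elastic Network Interface - Amazon Elastic Compute Cloud
     https://docs.aws.amazon.com/ja_jp/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI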
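The ENI limit above translates into a simple capacity estimate. The sketch below assumes that every task takes one ENI and that the instance's own primary ENI uses one slot; `max_awsvpc_tasks` is a hypothetical helper, and the value 3 is only an illustrative per-instance ENI limit, so look up the real figure for your instance type in the linked table.

```shell
#!/bin/sh
# max_awsvpc_tasks MAX_ENIS: estimate how many awsvpc tasks fit on one
# instance, assuming one ENI per task and one slot taken by the
# instance's primary ENI.
max_awsvpc_tasks() {
    max_enis=$1
    if [ "$max_enis" -lt 1 ]; then
        echo 0
        return
    fi
    echo $(( max_enis - 1 ))
}

max_awsvpc_tasks 3   # prints 2
```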
  13. Setting up task networking on EC2: a walkthrough (1)
     Prerequisite: eth1 has been attached in advance.
       $ ip addr show | grep eth[0-1]
       2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
           inet 172.31.21.5/20 brd 172.31.31.255 scope global eth0
       10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
           inet 172.31.64.128/24 brd 172.31.64.255 scope global eth1
     Create the network namespace "task":
       $ sudo -s
       # ip netns
       # ip netns add task
       # ip netns
       task
     Assign eth1 to the task namespace:
       # ip link set eth1 down
       # ip addr show dev eth1
       10: eth1: <BROADCAST,MULTICAST> mtu 9001 qdisc pfifo_fast state DOWN group default qlen 1000
           link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
           inet 172.31.64.128/24 brd 172.31.64.255 scope global eth1
              valid_lft forever preferred_lft forever
       # ip link set eth1 netns task
       # ip addr show dev eth1
       Device "eth1" does not exist.
  14. Setting up task networking on EC2: a walkthrough (2)
     Bring the link up for eth1 inside the task namespace:
       # ip netns exec task ip addr show dev eth1
       10: eth1: <BROADCAST,MULTICAST> mtu 9001 qdisc noop state DOWN group default qlen 1000
           link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
       # ip netns exec task ip link set eth1 up
       # ip netns exec task ip addr show dev eth1
       10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
           link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
           inet6 fe80::48b:5fff:fe7c:8294/64 scope link
              valid_lft forever preferred_lft forever
     Set the IP address and routing for eth1 appropriately:
       # ip netns exec task ip addr add 172.31.64.128/24 dev eth1
       # ip netns exec task ip addr show dev eth1
       10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
           link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
           inet 172.31.64.128/24 scope global eth1
              valid_lft forever preferred_lft forever
           inet6 fe80::48b:5fff:fe7c:8294/64 scope link
              valid_lft forever preferred_lft forever
       # ip netns exec task ip route
       172.31.64.0/24 dev eth1 proto kernel scope link src 172.31.64.128
       # ip netns exec task ip route add default via 172.31.64.1 dev eth1
       # ip netns exec task ip route
       default via 172.31.64.1 dev eth1
       172.31.64.0/24 dev eth1 proto kernel scope link src 172.31.64.128
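The configuration commands from the two walkthrough slides can be collected into one script. This is only a sketch: the variable values are the ones shown on the slides, the function names and the `run` helper are made up here, and by default the script just prints the commands (set `DRY_RUN=0` and run as root to execute them). The teardown step is an addition not shown in the deck.

```shell
#!/bin/sh
# Values taken from the slides; adjust them for your own instance.
NS=task
DEV=eth1
ADDR=172.31.64.128/24
GW=172.31.64.1

# run: print a command; execute it only when DRY_RUN=0 (needs root).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

setup_task_netns() {
    run ip netns add "$NS"
    run ip link set "$DEV" down
    # Moving the device drops its IPv4 address, as the slide output shows,
    # so the address and default route are added again inside the netns.
    run ip link set "$DEV" netns "$NS"
    run ip netns exec "$NS" ip link set "$DEV" up
    run ip netns exec "$NS" ip addr add "$ADDR" dev "$DEV"
    run ip netns exec "$NS" ip route add default via "$GW" dev "$DEV"
}

teardown_task_netns() {
    # Per network_namespaces(7), physical devices go back to the initial
    # namespace when the netns is freed.
    run ip netns del "$NS"
}

setup_task_netns
```

Once the namespace is configured, a server started with `ip netns exec task <command>` would listen on the ENI's address directly, which is the behavior the awsvpc mode builds on.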
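  15. Summary
     Bridge
       There is overhead from IP forwarding, but this is the mode to choose when consolidating containers with the same role on one host.
     Host
       There is no software overhead, but ports conflict, so containers with the same role cannot be consolidated on one host.
     awsvpc
       A network configuration that resolves the issues of both the bridge and host networks.
       The number of containers that can be started with task networking is capped by the ENI limit (the maximum number of ENIs per instance type), but with that caveat in mind it is worth using actively.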