

Yuji Shimoda
October 24, 2018

Higobashi.aws 7th AWS Container Hands-on Study Session: An Introduction to Containers from a Low-Level Perspective

Presentation slides from HIGOBASHI.AWS ( https://classmethod.connpass.com/event/103495/ ), held on October 25, 2018.



Transcript

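  1. What is Docker?

    A collective name for the container platform and software that Docker Inc. develops as OSS (Docker Enterprise, Desktop, Hub, and so on)
    One of several Linux container engines (others include rkt and runc)
    Written in Go; virtualization software that builds on Linux features:
      namespaces (pid, net, ipc, mnt, uts, etc.: isolate namespaces from one another)
      cgroups (short for control groups: limit resources)
      UnionFS (a file system that works by stacking layers)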
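The namespace isolation listed above can be observed directly: Linux exposes each process's namespace memberships as symbolic links under /proc/<pid>/ns. The following is a minimal sketch, assuming a Linux host with /proc mounted; `ns_id` and `same_ns` are hypothetical helper names introduced here for illustration and are not part of Docker.

```shell
#!/usr/bin/env bash
# Minimal sketch: compare namespace memberships via /proc/<pid>/ns symlinks.
# ns_id and same_ns are illustrative helper names (assumptions, not Docker APIs).

# Print the identifier a namespace link points to, e.g. "net:[4026531992]".
# $1 = path to a namespace link such as /proc/self/ns/net
ns_id() {
  readlink "$1"
}

# Succeed when two namespace links refer to the same namespace.
same_ns() {
  [ "$(ns_id "$1")" = "$(ns_id "$2")" ]
}

# Example usage (only meaningful on Linux, so it is guarded):
if [ -e /proc/self/ns/net ]; then
  echo "network namespace of this shell: $(ns_id /proc/self/ns/net)"
fi
```

Comparing a container process with the host shell, for example `same_ns /proc/<container-pid>/ns/net /proc/$$/ns/net`, typically requires root because the links of other users' processes are not readable.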
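  2. The Linux kernel is common software

    Strictly speaking, Linux is a kernel; in the broad sense, it is an OS
    It is usually consumed as a Linux distribution (hereafter "distro"):
      Amazon Linux
      Red Hat Enterprise Linux
      Ubuntu Desktop/Server
      etc…
    The Linux kernel is essentially common software across distros
    If the architecture is the same, the same application can run on different distros (provided they are ABI-compatible)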
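  3. Boot process of a guest OS (x86)

    The following is a rough explanation.
    After BIOS POST or EFI initialization, the boot loader (GRUB, etc.) is started
    The CPU switches from real mode to protected mode (enabling tasks and protection features)
    The boot loader loads the kernel into physical memory
    Kernel boot preparation (page table creation, kernel stack initialization, etc.)
    Kernel boot (bring up the CPUs, initialize each hardware device, etc.)
    The init (PID 1) process is created
    Startup scripts run and start the various services
    Reference: Xv6, a simple Unix-like teaching operating system
      https://pdos.csail.mit.edu/6.828/2012/xv6.html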
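  4. What is a Network Namespace?

    A feature that virtually partitions the Linux network space
    A network namespace (hereafter "netns") isolates the system resources listed below, and also isolates the UNIX domain abstract socket namespace:
      Network devices
      IPv4/IPv6 protocol stacks
      IP routing tables
      Firewall rules
      The /proc/net directory (a symbolic link to /proc/PID/net)
      The /sys/class/net directory
      Various files under /proc/sys/net
      Port numbers (sockets)
    A physical network device can belong to exactly one netns
    A virtual network device pair (veth pair) provides a pipe-like abstraction that can act as an L2 tunnel between netns, and can be used to create a bridge to a physical network device that belongs to another netns
    NETWORK_NAMESPACES(7) Linux Programmer's Manual
      http://man7.org/linux/man-pages/man7/network_namespaces.7.html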
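To make the veth pair description concrete, the sketch below connects the host to a fresh netns with one pair. It is only an illustration: the names `demo`, `veth-host`, `veth-demo` and the 10.200.0.0/24 range are arbitrary example values that do not come from the slides, and `run` with its `DRY_RUN` switch is a hypothetical helper added so the commands can be previewed without root.

```shell
#!/usr/bin/env bash
# Minimal sketch: connect the host to a new netns with a veth pair.
# demo, veth-host, veth-demo and 10.200.0.0/24 are arbitrary example values
# (assumptions), not taken from the slides.

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_veth_demo() {
  run ip netns add demo                                    # new network namespace
  run ip link add veth-host type veth peer name veth-demo  # create both ends of the pair
  run ip link set veth-demo netns demo                     # move one end into demo
  run ip addr add 10.200.0.1/24 dev veth-host              # address the host end
  run ip link set veth-host up
  run ip netns exec demo ip addr add 10.200.0.2/24 dev veth-demo
  run ip netns exec demo ip link set veth-demo up
}

# Preview the commands without changing the system:
DRY_RUN=1 setup_veth_demo
```

Running `setup_veth_demo` as root without `DRY_RUN` applies the changes; `ip netns delete demo` removes the namespace again, and the veth pair disappears with it.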
  5. Checking before and after installing Docker

    Before installing Docker (with bridge-utils installed beforehand)

      $ brctl show
      bridge name     bridge id               STP enabled     interfaces

    After installing Docker

      $ brctl show
      bridge name     bridge id               STP enabled     interfaces
      docker0         8000.024271457a61       no

    After starting a Docker (nginx) container

      $ docker ps
      CONTAINER ID   IMAGE   COMMAND                  CREATED      STATUS      PORTS                  NAMES
      efa1e165e8e7   nginx   "nginx -g 'daemon of…"   2 days ago   Up 2 days   0.0.0.0:8080->80/tcp   elegant_cori
      $ brctl show
      bridge name     bridge id               STP enabled     interfaces
      docker0         8000.024271457a61       no              veth3fc628b
  6. Checking before and after installing Docker (continued)

    Connectivity check from the Docker host (ECS instance) to the container

      $ docker network inspect bridge | jq ".[].Containers"
      {
        "efa1e165e8e7fe4dc924d5da4ea6a5c083ba87064e2abb89ba1b71ba351bb40c": {
          "Name": "elegant_cori",
          "EndpointID": "e5c3325faeea2a57fc16f6835add9c99268849de38389264942f0ae36419180f",
          "MacAddress": "02:42:ac:11:00:02",    ★ MAC address assigned to the container's eth0
          "IPv4Address": "172.17.0.2/16",       ★ IP address assigned to the container's eth0
          "IPv6Address": ""
        }
      }
      $ ip add show dev docker0
      3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
          link/ether 02:42:71:45:7a:61 brd ff:ff:ff:ff:ff:ff
          inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0    ★ IP address of the Linux bridge (docker0)
          :
      $ ping -I docker0 -c 1 172.17.0.2    ★ ping the container's eth0 through the docker0 interface
      PING 172.17.0.2 (172.17.0.2) from 172.17.0.1 docker0: 56(84) bytes of data.
      64 bytes from 172.17.0.2: icmp_seq=1 ttl=255 time=0.043 ms
      --- 172.17.0.2 ping statistics ---
      1 packets transmitted, 1 received, 0% packet loss, time 0ms
      rtt min/avg/max/mdev = 0.043/0.043/0.043/0.000 ms
      $ arp -a
      ip-172-31-16-1.ap-northeast-1.compute.internal (172.31.16.1) at 06:e7:bc:91:cc:b2 [ether] on eth0
      ip-172-17-0-2.ap-northeast-1.compute.internal (172.17.0.2) at 02:42:ac:11:00:02 [ether] on docker0    ★ MAC address of the container's eth0
  7. How does a container communicate with the outside of the Docker host?

    Packets addressed to the container are forwarded according to the iptables rules

      $ sudo iptables -S
      -P INPUT ACCEPT
      -P FORWARD DROP
      -P OUTPUT ACCEPT
      -N DOCKER
      -N DOCKER-ISOLATION-STAGE-1
      -N DOCKER-ISOLATION-STAGE-2
      -N DOCKER-USER
      -A FORWARD -j DOCKER-USER
      -A FORWARD -j DOCKER-ISOLATION-STAGE-1
      -A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
      -A FORWARD -o docker0 -j DOCKER
      -A FORWARD -i docker0 ! -o docker0 -j ACCEPT
      -A FORWARD -i docker0 -o docker0 -j ACCEPT
      -A DOCKER -d 172.17.0.2/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT    ★ packets destined for the container address are forwarded
      -A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
      -A DOCKER-ISOLATION-STAGE-1 -j RETURN
      -A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
      -A DOCKER-ISOLATION-STAGE-2 -j RETURN
      -A DOCKER-USER -j RETURN
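One way to read the rule set above is to pull out just the DOCKER chain entries that accept traffic for a container address. The sketch below does this with awk over `iptables -S` style text on stdin; `published_targets` is a hypothetical helper name, and the parsing assumes the rule layout shown in the slide rather than every form iptables can emit.

```shell
#!/usr/bin/env bash
# Minimal sketch: list container destinations accepted by the DOCKER chain,
# reading `iptables -S` style rules from stdin.
# published_targets is an illustrative helper name (assumption).

published_targets() {
  awk '
    $1 == "-A" && $2 == "DOCKER" {
      dst = ""; port = ""; proto = ""; verdict = ""
      # Walk the option/value pairs of the rule.
      for (i = 3; i < NF; i++) {
        if ($i == "-d")      dst = $(i + 1)
        if ($i == "--dport") port = $(i + 1)
        if ($i == "-p")      proto = $(i + 1)
        if ($i == "-j")      verdict = $(i + 1)
      }
      if (verdict == "ACCEPT" && dst != "" && port != "")
        print dst, proto, port
    }'
}

# Example with rules taken from the slide; prints: 172.17.0.2/32 tcp 80
published_targets <<'EOF'
-A FORWARD -o docker0 -j DOCKER
-A DOCKER -d 172.17.0.2/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 80 -j ACCEPT
-A DOCKER-USER -j RETURN
EOF
```

On a live host the same function could be fed with `sudo iptables -S | published_targets`.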
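  8. host network (conceptual diagram)

    (Diagram label: Global Namespace)
    No Network Namespace is created for the container; the container process is simply started inside the Global Namespace
    Compared with the bridge network, overhead is lower because no IP forwarding between namespaces is required
    However, because listen ports conflict, multiple web containers listening on Port#80 cannot be started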
  9. Starting a container and checking

    The nginx process is LISTENing on port 80 of the Docker host

      $ docker run --rm -d --net=host nginx
      33f02f44760359b8c4a6ac3870ba1c5991977557daf95450988ab2fa41d852c3
      $ docker ps
      CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS         PORTS   NAMES
      33f02f447603   nginx   "nginx -g 'daemon of…"   4 seconds ago   Up 4 seconds           epic_liskov
      $ docker network inspect host | jq ".[].Containers"
      {
        "33f02f44760359b8c4a6ac3870ba1c5991977557daf95450988ab2fa41d852c3": {
          "Name": "epic_liskov",
          "EndpointID": "1d978b7bfa67fcd260ed830407d1371596b1bbac6ead9183a4e2f1e6345abe92",
          "MacAddress": "",
          "IPv4Address": "",
          "IPv6Address": ""
        }
      }
      $ sudo lsof -i:80
      lsof: no pwd entry for UID 101
      COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
      nginx   5157 root    6u  IPv4  58496      0t0  TCP *:http (LISTEN)
      lsof: no pwd entry for UID 101
      nginx   5183  101    6u  IPv4  58496      0t0  TCP *:http (LISTEN)
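Because a host-network container binds its port in the host's own network namespace, the listening socket is visible in the host's /proc/net/tcp, and a second container asking for the same port would collide with it. The sketch below checks for such a listener before starting another container. It is a rough illustration only: `port_is_listening` is a hypothetical helper name, and it reads the IPv4 table only (IPv6 listeners live in /proc/net/tcp6).

```shell
#!/usr/bin/env bash
# Minimal sketch: check whether a TCP port is already in LISTEN state by
# reading a /proc/net/tcp style table (state 0A = LISTEN, ports are hex).
# port_is_listening is an illustrative helper name (assumption).

# $1 = decimal port, $2 = table file (defaults to /proc/net/tcp)
port_is_listening() {
  local want hex
  want=$(printf '%04X' "$1")          # e.g. 80 -> 0050
  while read -r _ local_addr _ state _; do
    hex=${local_addr##*:}             # keep the port part after the colon
    if [ "$state" = "0A" ] && [ "$hex" = "$want" ]; then
      return 0
    fi
  done < "${2:-/proc/net/tcp}"
  return 1
}

# Example usage (guarded, since /proc/net/tcp exists only on Linux):
if [ -r /proc/net/tcp ] && port_is_listening 80; then
  echo "port 80 is already taken in this network namespace"
fi
```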
  10. Task networking (awsvpc network mode)

    amazon-ecs-agent/eni.md at master · aws/amazon-ecs-agent
      https://github.com/aws/amazon-ecs-agent/blob/master/proposals/eni.md
    The only network mode supported on AWS Fargate
    Technically, it is a combination of Linux network namespaces and ENIs
    The ECS CNI plugins are developed according to the CNI (Container Network Interface) specification
    aws/amazon-ecs-cni-plugins: Networking Plugins repository for ECS Task Networking
      https://github.com/aws/amazon-ecs-cni-plugins
  11. Under the hood: task networking for Amazon ECS

    Reference) Under the Hood: Task Networking for Amazon ECS | Amazon Web Services Blog (Japanese edition)
      https://aws.amazon.com/jp/blogs/news/under-the-hood-task-networking-for-amazon-ecs/
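  12. Advantages and disadvantages of task networking

    Advantages
      No software overhead of the kind the Bridge network has
        The container process LISTENs directly on the ENI assigned to its namespace, so performance is equivalent to the Host network
      Management becomes simpler, because the minimum set of ports each service needs is configured in a security group per task (container)
    Disadvantages
      The number of containers that can be started depends on the ECS instance type
        (as an EC2 limit, the maximum number of ENIs that can be attached differs per instance type)
        Elastic Network Interface - Amazon Elastic Compute Cloud
        https://docs.aws.amazon.com/ja_jp/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI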
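The ENI limit translates into a simple capacity estimate. The sketch below assumes that every awsvpc task consumes one ENI and that the instance's primary ENI counts toward the per-type limit, so the usable number is the limit minus one; `max_awsvpc_tasks` is a hypothetical helper name, and the actual limit for a given instance type has to be looked up in the EC2 documentation linked above.

```shell
#!/usr/bin/env bash
# Minimal sketch: estimate how many awsvpc tasks fit on one instance.
# Assumption: one ENI per task, and the instance's primary ENI counts
# toward the per-type ENI limit.
# max_awsvpc_tasks is an illustrative helper name.

# $1 = maximum number of ENIs allowed for the instance type
max_awsvpc_tasks() {
  local limit=$1
  if [ "$limit" -lt 1 ]; then
    echo 0
  else
    echo $(( limit - 1 ))
  fi
}

# An instance type that allows 3 ENIs leaves room for 2 tasks.
max_awsvpc_tasks 3
```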
  13. Building task networking on EC2: walkthrough (1)

    Prerequisite) eth1 must already be attached

      $ ip addr show | grep eth[0-1]
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
          inet 172.31.21.5/20 brd 172.31.31.255 scope global eth0
      10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
          inet 172.31.64.128/24 brd 172.31.64.255 scope global eth1

    Create the network namespace "task"

      $ sudo -s
      # ip netns
      # ip netns add task
      # ip netns
      task

    Assign eth1 to the task namespace

      # ip link set eth1 down
      # ip addr show dev eth1
      10: eth1: <BROADCAST,MULTICAST> mtu 9001 qdisc pfifo_fast state DOWN group default qlen 1000
          link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
          inet 172.31.64.128/24 brd 172.31.64.255 scope global eth1
             valid_lft forever preferred_lft forever
      # ip link set eth1 netns task
      # ip addr show dev eth1
      Device "eth1" does not exist.
  14. Building task networking on EC2: walkthrough (2)

    Bring up the link of eth1 inside the task namespace

      # ip netns exec task ip addr show dev eth1
      10: eth1: <BROADCAST,MULTICAST> mtu 9001 qdisc noop state DOWN group default qlen 1000
          link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
      # ip netns exec task ip link set eth1 up
      # ip netns exec task ip addr show dev eth1
      10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
          link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
          inet6 fe80::48b:5fff:fe7c:8294/64 scope link
             valid_lft forever preferred_lft forever

    Configure the IP address and routing of eth1 appropriately

      # ip netns exec task ip addr add 172.31.64.128/24 dev eth1
      # ip netns exec task ip addr show dev eth1
      10: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
          link/ether 06:8b:5f:7c:82:94 brd ff:ff:ff:ff:ff:ff
          inet 172.31.64.128/24 scope global eth1
             valid_lft forever preferred_lft forever
          inet6 fe80::48b:5fff:fe7c:8294/64 scope link
             valid_lft forever preferred_lft forever
      # ip netns exec task ip route
      172.31.64.0/24 dev eth1 proto kernel scope link src 172.31.64.128
      # ip netns exec task ip route add default via 172.31.64.1 dev eth1
      # ip netns exec task ip route
      default via 172.31.64.1 dev eth1
      172.31.64.0/24 dev eth1 proto kernel scope link src 172.31.64.128
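The manual steps of the walkthrough can be collected into one function. The sketch below mirrors the commands and values used in the slides (namespace `task`, device `eth1`, 172.31.64.128/24, gateway 172.31.64.1); the function name `move_eni_to_netns` and the `run`/`DRY_RUN` preview switch are hypothetical additions, and this is not how the ECS agent itself is implemented.

```shell
#!/usr/bin/env bash
# Minimal sketch: the walkthrough steps as one function.
# move_eni_to_netns, run and DRY_RUN are illustrative assumptions.

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = namespace, $2 = device, $3 = address/prefix, $4 = default gateway
move_eni_to_netns() {
  local ns=$1 dev=$2 addr=$3 gw=$4
  run ip netns add "$ns"                                  # create the namespace
  run ip link set "$dev" down                             # take the device down
  run ip link set "$dev" netns "$ns"                      # move it into the namespace
  run ip netns exec "$ns" ip link set "$dev" up           # link up inside the namespace
  run ip netns exec "$ns" ip addr add "$addr" dev "$dev"  # the address is lost on move, so re-add it
  run ip netns exec "$ns" ip route add default via "$gw" dev "$dev"
}

# Preview with the values from the slides (no changes are made):
DRY_RUN=1 move_eni_to_netns task eth1 172.31.64.128/24 172.31.64.1
```

Run as root without `DRY_RUN` to apply the steps; afterwards `ip netns exec task <command>` starts a process that sees only eth1 and its routes.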
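  15. Summary

    Bridge
      There is overhead from IP forwarding, but it has to be chosen when consolidating containers that play the same role on one host
    Host
      There is no software overhead, but because ports conflict, containers that play the same role cannot be consolidated on one host
    awsvpc
      A network configuration that resolves the issues of both the Bridge and Host networks
      The number of containers that can be started with task networking is capped by the ENI limit (the maximum number of ENIs per instance type); with that caveat in mind, it is worth adopting actively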