coil.pdf

Akihiro Ikezoe
February 19, 2019


Transcript

  1. Building our own CNI plugin for large-scale Kubernetes clusters
    Cybozu, Inc.
    Akihiro Ikezoe
    Kubernetes Meetup Tokyo #16
    2019/02/19

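  2. Today's agenda
    • Introduction to Neco, our infrastructure renewal project
    • Why did we build our own CNI plugin?
    • Introduction to coil, our home-grown CNI plugin
    • Lessons learned from developing a CNI plugin
    • Summary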

  3. The infrastructure renewal project
    Neco

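  4. What is the infrastructure renewal project Neco?
    • A project to renew the infrastructure of [name missing in transcript] by adopting Kubernetes and [name missing in transcript]
    • What is [name missing in transcript]?
      • Provides services such as kintone, Garoon, and Office as SaaS
      • Adopted by 25,000 companies, with more than 1 million users
      • Released in 2011
      • VM-based architecture
      • Runs more than 1,000 servers in-house in rented data center space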

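  5. What is the infrastructure renewal project Neco?
    • Goals
      • Drastically reduce maintenance costs (aiming for NoOps)
      • Improve scalability
      • Improve server consolidation
      • Application development teams take part in deployment and operations
    • Most of the deliverables are published as OSS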

  6. Neco architecture
    [Figure: architecture diagram. Captions: "about 5 boot servers", "servers on the scale of 1,000 machines", "thousands of application containers or more", "Kubernetes deployment and management", "continuous delivery". Labels: Boot Server, Ubuntu, neco-updater, sabakan, CKE, Kubernetes, CoreOS Node (repeated), LB, Prometheus, squid, CoreDNS, Argo CD, Rook, app, MySQL, Elasticsearch.]

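  7. The software that underpins Neco
    • sabakan
      • Automates lifecycle management and provisioning of server hardware.
      • Automatically performs BIOS configuration, OS network boot, disk encryption, and setup of various system software.
    • CKE (Cybozu Container Engine)
      • A tool that automatically builds and operates Kubernetes clusters.
      • Automatically deploys Kubernetes onto the servers that sabakan has provisioned.
      • Automates operations such as repairing errors and removing failed machines from the cluster.
    • neco-updater
      • A continuous delivery tool for the infrastructure.
      • Checks GitHub release information and automatically deploys software such as CKE and sabakan and updates CoreOS images.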

  8. Why did we build
    our own CNI plugin?

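  9. Why did we build our own CNI plugin?
    • We needed a network plugin that matches the network design of the Neco project.
    • We evaluated existing plugins, but none of them fully met our requirements.
    • Multiple CNI plugins can be combined, so we decided to build only the parts we needed.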

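  10. Neco's network design
    • Clos architecture
      • Flat L3 network
      • Every node is reachable in the same number of hops
      • Scales with growing east-west traffic
    • Routing with BGP
      • AS per Rack
      • Redundant routes with ECMP
      • Fast route convergence with BFD
    • See the blog post for details
      • https://blog.cybozu.io/entry/2018/11/01/113000
    [Figure: network diagram with Rack1, Rack2, Rack3]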

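  11. Selecting a CNI plugin
    • To match the data center network, we want the Kubernetes network to use BGP as well so that routing is efficient.
    • Calico
      • Actively developed, feature-rich, and widely adopted.
      • Concerns: it bundles its own BGP speaker, and the number of routes grows in large clusters.
    • Romana
      • Functionally meets our requirements.
      • Does not support etcd v3. Development is not active and it has not kept up with the latest Kubernetes.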

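  12. Combining plugins
    • Container networking has many problems to solve.
      • IP address management (IPAM)
      • Network connectivity
      • Network policy
      • Bandwidth limiting
    • There is no need to solve every problem with a single plugin.
    • An example of combining multiple plugins: Canal
      • IPAM uses the standard functionality of the CNI plugins
      • Network connectivity uses flannel
      • Network policy uses Calico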

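  13. Which parts do we need to build ourselves?
    • Network policy
      • Hard to implement, and there is no need to differentiate here.
      → Use Calico, Cilium, or similar
    • IP address management (IPAM)
      • We need a mechanism that keeps the routing table from growing.
      • We also want to be able to assign global IP addresses to Pods.
      → Build it ourselves
    • Network connectivity
      • Route exchange with the outside of the node is left to the routing software in the data center
      → Build the intra-node routing part ourselves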

  14. Our home-grown CNI plugin
    coil

  15. What is coil?
    • A CNI plugin developed by Cybozu
      • Published as OSS
      • https://github.com/cybozu-go/coil
    • Features
      • Supports CNI Spec v0.3.1 (Kubernetes v1.13)
      • Implemented in Go, with etcd as the backend
      • Linux only, Kubernetes only, IPv4 only
      • Provides only IPAM and intra-node routing
      • Does not bundle routing software
      • Designed with large clusters in mind

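  16. Designing for large clusters
    • No overlay network
      • Overlay networks such as VXLAN reduce throughput.
      → Establish network connectivity with plain routing
    • No Linux bridge
      • Using a Linux bridge increases CPU usage.
      → Add routes to the veth devices in the routing table.
    • Supports etcd API v3 only
      • etcd API v2 performance degrades when many clients connect.
    • Prevent the number of routes from growing
      • Advertising a route per Pod makes the number of routes balloon.
      → With a mechanism called address blocks, advertise one route per subnet.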

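  17. Components of coil
    • etcd
      • Stores the assigned address information
    • coil-controller
      • A Deployment on k8s
      • Releases address blocks that are no longer in use
    • coil:
      • The CNI plugin itself
      • Adds and removes veth devices for Pods, assigns IP addresses, configures routing, and so on
    • coil-node: a DaemonSet on k8s
      • coild:
        • Manages addresses and configures routing on each node
      • coil-installer:
        • Installs coil and its configuration files
    [Figure: component diagram. Labels: Node, coil-node, coild, coil installer, coil, conf, etcd, coil-controller. Captions: "housekeeping", "setup".]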

  18. Address blocks (Inspired by Romana)
    • coil introduces a mechanism called address blocks and advertises routes per subnet (for example, /28), which keeps the routing table from bloating.
    [Figure: address block diagram.
     Pod address range: 10.0.1.0/24, divided into address blocks 10.0.1.0/28, 10.0.1.16/28, 10.0.1.32/28, 10.0.1.48/28, 10.0.1.64/28, ...
     Node1 holds address block 10.0.1.0/28 with Pods 10.0.1.0/32 and 10.0.1.1/32.
     Node2 holds address block 10.0.1.16/28 with Pods 10.0.1.16/32 and 10.0.1.17/32.
     BGP Router: 10.0.1.0/28 -> Node1, 10.0.1.16/28 -> Node2.
     Captions: "Allocatable addresses are divided into units called blocks", "Address blocks are assigned per node", "Pod addresses are assigned from the address block", "Routes are advertised per subnet".]
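The address-block arithmetic on this slide can be reproduced with Python's standard `ipaddress` module. This is only a sketch of the idea, not coil's implementation (coil is written in Go); the helper names `split_into_blocks` and `routes_per_block` are invented for this example, and the ranges are the ones shown on the slide.

```python
import ipaddress


def split_into_blocks(pod_range: str, block_prefix: int) -> list:
    """Split the Pod address range into fixed-size address blocks."""
    network = ipaddress.ip_network(pod_range)
    return list(network.subnets(new_prefix=block_prefix))


def routes_per_block(assignments: dict) -> dict:
    """Return one route per address block, mapping the block to its node."""
    return {
        str(block): node
        for node, blocks in assignments.items()
        for block in blocks
    }


# Values taken from the slide: a /24 Pod range cut into /28 blocks.
blocks = split_into_blocks("10.0.1.0/24", 28)
assignments = {"Node1": [blocks[0]], "Node2": [blocks[1]]}
routes = routes_per_block(assignments)

for prefix, node in routes.items():
    print(f"{prefix} -> {node}")
```

With /28 blocks, a node needs one advertised route for up to 16 Pod addresses instead of one /32 route per Pod.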

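  19. coil processing flow
    1. kubelet receives an instruction and creates a Pod
    2. The CNI plugin coil is executed
    3. coil requests an IP address from coild
    4. An address block is allocated from etcd
    5. The route is written to the routing table
    6. The routing table is read and the route is advertised
    7. coild returns the IP address to coil
    8. A veth pair is created in the Pod's netns, the IP address is assigned, and the routes are configured
    [Figure: flow diagram. Labels: Node, kubelet, coil, coild, Pod, eth0, veth, etcd, routing table, BGP Speaker, BGP Router.]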
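Steps 3 to 7 of the flow can be sketched as a small in-memory model. Everything here is an assumption made for illustration: `BlockStore` stands in for etcd, `NodeAllocator` stands in for the per-node daemon, and neither reflects coil's real data model or API.

```python
import ipaddress


class BlockStore:
    """Hypothetical stand-in for etcd: hands out unused address blocks."""

    def __init__(self, pod_range: str, block_prefix: int):
        network = ipaddress.ip_network(pod_range)
        self._free = list(network.subnets(new_prefix=block_prefix))

    def acquire(self):
        if not self._free:
            raise RuntimeError("no free address block")
        return self._free.pop(0)


class NodeAllocator:
    """Hypothetical stand-in for the per-node daemon."""

    def __init__(self, store: BlockStore):
        self.store = store
        self.blocks = []
        self.used = set()
        self.routing_table = []  # block routes a BGP speaker would pick up

    def _find_free(self):
        # Every address in a block is usable because Pods get /32 addresses.
        for block in self.blocks:
            for addr in block:
                if addr not in self.used:
                    return addr
        return None

    def request_address(self):
        addr = self._find_free()
        if addr is None:
            # The node's blocks are exhausted: take a new block (step 4)
            # and record its route (step 5).
            block = self.store.acquire()
            self.blocks.append(block)
            self.routing_table.append(str(block))
            addr = self._find_free()
        self.used.add(addr)
        return addr  # step 7: the address goes back to the plugin


store = BlockStore("10.0.1.0/24", 28)
node1 = NodeAllocator(store)
print(node1.request_address(), node1.routing_table)
```

The routing table only changes when a node takes a new block, so most Pod creations add no new advertised route.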

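  20. boot-taint
    • It is a problem if Pods get scheduled onto a node where the CNI plugin has not been set up yet.
    • Specify --register-with-taints in the kubelet startup options so that Pods cannot be scheduled onto a node right after it boots.
    • Once the coil setup has finished, remove the taints so that Pods can be scheduled.
    taints/tolerations
    A Kubernetes feature. Adding taints to a node prohibits Pods from being scheduled or run on it. Only Pods with the matching tolerations can be scheduled there.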
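The effect of a boot taint on scheduling can be sketched as follows. The taint key `example.com/bootstrap` is a placeholder chosen for this example, not necessarily the key coil uses, and the matching rule is a simplified model of Kubernetes taints and tolerations rather than the scheduler's actual code. The `key=value:effect` format is the one accepted by the kubelet `--register-with-taints` flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


def parse_taints(flag_value: str) -> list:
    """Parse a comma-separated list of key=value:effect taints."""
    taints = []
    for item in flag_value.split(","):
        key_value, effect = item.rsplit(":", 1)
        key, _, value = key_value.partition("=")
        taints.append(Taint(key, value, effect))
    return taints


def tolerates(toleration: dict, taint: Taint) -> bool:
    """Simplified matching: key, optional effect, then operator."""
    if toleration.get("key") != taint.key:
        return False
    if toleration.get("effect") and toleration["effect"] != taint.effect:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.value


def schedulable(taints: list, tolerations: list) -> bool:
    """A Pod can be scheduled only if every NoSchedule taint is tolerated."""
    return all(
        any(tolerates(t, taint) for t in tolerations)
        for taint in taints
        if taint.effect == "NoSchedule"
    )


# Hypothetical taint registered at boot, before the CNI plugin is ready.
boot_taints = parse_taints("example.com/bootstrap=true:NoSchedule")
print(schedulable(boot_taints, []))  # ordinary application Pod
print(schedulable([], []))           # after the taint has been removed
```

A setup component such as an installer DaemonSet would carry a toleration for the boot taint so that it can run on the node before the taint is removed.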

  21. Lessons learned from
    developing a CNI plugin

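  22. Isn't development and debugging hard?
    • The Neco project maintains an environment that virtualizes the entire data center network in software, so behavior can easily be checked on a local machine.
    • Because it is a simple L3 network, it is easy to investigate.
      With nsenter and tcpdump, most problems can be investigated.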

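  23. How do you get Kubernetes information?
    • CNI plugins are not specific to Kubernetes, so there is no specification for how a plugin obtains Pod information.
      • https://github.com/containernetworking/cni/issues/606
    • Currently, when Kubernetes invokes a CNI plugin, it passes information such as K8S_POD_NAME and K8S_POD_NAMESPACE through CNI_ARGS.
    • Since this is not clearly defined as a specification, we had to read the implementations. (Each container runtime, such as dockershim and containerd, implements it on its own.)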
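Reading the Pod name and namespace from CNI_ARGS can be sketched as below. CNI_ARGS is a semicolon-separated list of key=value pairs passed as an environment variable; the sample value and the helper names `parse_cni_args` and `pod_identity` are made up for this example, and real runtimes may pass additional keys.

```python
import os


def parse_cni_args(raw: str) -> dict:
    """Parse 'KEY=VALUE;KEY=VALUE' into a dictionary."""
    args = {}
    for pair in raw.split(";"):
        if not pair:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed CNI_ARGS entry: {pair!r}")
        args[key] = value
    return args


def pod_identity(environ=os.environ) -> tuple:
    """Return (namespace, name) of the Pod, or None for missing keys."""
    args = parse_cni_args(environ.get("CNI_ARGS", ""))
    return args.get("K8S_POD_NAMESPACE"), args.get("K8S_POD_NAME")


# Hypothetical environment, shaped like what a container runtime passes.
sample_env = {
    "CNI_ARGS": "IgnoreUnknown=1;K8S_POD_NAMESPACE=default;K8S_POD_NAME=web-0"
}
print(pod_identity(sample_env))
```

Because the keys are a convention rather than part of the CNI specification, a plugin should handle the case where they are absent.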

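  24. The CNI plugin is not loaded
    • Kubernetes runs plugins based on the configuration placed in /etc/cni/net.d.
    • When we created an application Pod right after updating the configuration file, the configured plugin was sometimes not used.
    • kubelet updates the container runtime status every 5 seconds. The CNI plugin configuration is also loaded at that point.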

  25. Pod-to-Pod communication fails on Kubernetes 1.13
    • After upgrading Kubernetes to v1.13, Pods could no longer communicate with each other.
    • The kube-proxy IPVS mode implementation had changed Linux kernel parameter (sysctl) settings (not mentioned in the release notes)
      • net.ipv4.conf.all.arp_ignore: 0 -> 1
      • net.ipv4.conf.all.arp_announce: 0 -> 2
    • Several CNI plugins are affected
      • https://github.com/kubernetes/kubernetes/issues/71555
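A node can be checked for this change by reading the two parameters from procfs. The sketch below assumes a Linux host where sysctls are exposed under /proc/sys; the function names are invented for this example, and the root path is a parameter so the logic can be exercised against a scratch directory.

```python
from pathlib import Path

# The two parameters named on the slide.
ARP_SYSCTLS = ("arp_ignore", "arp_announce")


def read_arp_sysctls(proc_sys="/proc/sys") -> dict:
    """Read net.ipv4.conf.all.arp_ignore and arp_announce as integers."""
    base = Path(proc_sys) / "net" / "ipv4" / "conf" / "all"
    return {name: int((base / name).read_text().strip()) for name in ARP_SYSCTLS}


def changed_from_defaults(values: dict) -> dict:
    """Return the parameters that differ from 0, the value before the change."""
    return {name: value for name, value in values.items() if value != 0}
```

On a Linux node, `changed_from_defaults(read_arp_sysctls())` returning `{"arp_ignore": 1, "arp_announce": 2}` would match the change described above.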

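  26. The problem and the fix
    Before the fix:
    • Node: ens3 192.168.1.11; the Pod's eth0 is connected to a veth on the node
    • Pod IP: 10.0.1.1/32, GW: 192.168.1.11
    • The Pod sends an ARP request for 192.168.1.11
    • arp_ignore=0: the veth answers the ARP request for ens3's address
    • arp_ignore=1: the veth does not answer the ARP request for ens3's address, so the Pod cannot reach the Node!
    After the fix:
    • Pod IP: 10.0.1.1/32, GW: 169.254.1.1; the veth has 169.254.1.1/32
    • We assigned a link-local address to the veth and changed the Pod's default gateway to that address.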
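The fix can be illustrated by generating the equivalent iproute2 commands. This is an illustration only, not how coil applies the settings: the function name `plan_pod_network`, the device name `veth0`, and the exact commands are assumptions of this sketch, and the commands are returned as strings rather than executed.

```python
import ipaddress


def plan_pod_network(pod_ip: str, gateway: str, host_veth: str, pod_if: str = "eth0") -> dict:
    """Return ip(8) commands for the Pod netns and the host side."""
    ipaddress.ip_address(pod_ip)  # raises ValueError on a malformed address
    if not ipaddress.ip_address(gateway).is_link_local:
        raise ValueError(f"gateway {gateway} is not a link-local address")
    pod_cmds = [
        f"ip addr add {pod_ip}/32 dev {pod_if}",
        # The gateway is not in the Pod's /32, so route to it explicitly.
        f"ip route add {gateway}/32 dev {pod_if} scope link",
        f"ip route add default via {gateway} dev {pod_if}",
    ]
    host_cmds = [
        # The link-local address lives on the veth itself, so ARP for it
        # is answered regardless of arp_ignore=1.
        f"ip addr add {gateway}/32 dev {host_veth}",
        f"ip route add {pod_ip}/32 dev {host_veth} scope link",
    ]
    return {"pod": pod_cmds, "host": host_cmds}


plan = plan_pod_network("10.0.1.1", "169.254.1.1", "veth0")
for side, commands in plan.items():
    for command in commands:
        print(f"[{side}] {command}")
```

Using the node's own address (192.168.1.11 in the slide) as the gateway is rejected here, which mirrors the configuration that stopped working once arp_ignore was set to 1.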

  27. Summary and future work

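  28. Summary and future work
    • Summary
      • We developed coil, a CNI plugin for Clos architectures designed with large Kubernetes clusters in mind
      • We think coil's implementation is simple and easy to read, so it is also a good fit for anyone who wants to learn about CNI plugins
    • Future work
      • Apply it to a large cluster and verify its stability
      • Make it easier to use by providing tutorials and the like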

  29. We are hiring!
    • Neco project job openings
      • https://cybozu.co.jp/company/job/recruitment/list/neco_project.html
    • Skill check sheet
      • https://gist.github.com/ymmt2005/bd92296166e52d1beba9df8ac516a9db
      • Introduces the skills you can acquire on the Neco project
    • Diverse working styles
      • Our members work in a variety of ways, including fully remote work, 20-hour work weeks, and holding a second job at another company