Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Research Paper Introduction #41 “Andromeda: Per...

cafenero_777
November 12, 2022

Research Paper Introduction #41 “Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization”

cafenero_777

November 12, 2022
Tweet

More Decks by cafenero_777

Other Decks in Technology

Transcript

  1. Research Paper Introduction #41 “Andromeda: Performance, Isolation, and Velocity at

    Scale in Cloud Network Virtualization” ௨ࢉ#108 @cafenero_777 2022/11/10 1
  2. Agenda •ର৅࿦จ •֓ཁͱಡ΋͏ͱͨ͠ཧ༝ 1. Introduction 2. Overview 3. Control Plane

    4. VM Host Dataplane 5. Evaluation 6. Experience 7. Related Work 8. Conclusions 2
  3. ର৅࿦จ •Andromeda: Performance, Isolation, and Velocity at Scale in Cloud

    Network Virtualization • Michael Dalton, et.al. ૯੎24໊ at Google inc. • NSDI ‘18 • https://www.usenix.org/conference/nsdi18/presentation/dalton • Tech blog • https://cloud.google.com/blog/products/networking/google-cloud-networking-in- depth-how-andromeda-2-2-enables-high-throughput-vms 3
  4. ֓ཁͱಡ΋͏ͱͨ͠ཧ༝ •֓ཁ • GCPͷData Plane stackͷϑϩʔॲཧ • HoverboadϞσϧ (GWར༻)Ͱ਺ສVMن໛ͷNWΛ਺ඵͰσϓϩΠ •

    HVͰ͸kernel bypass & PMD౳Ͱߴ଎ॲཧ •ಡ΋͏ͱͨ͠ཧ༝ • վΊͯؾʹͳͬͨͷͰɻ • VMϦʔνఏڙͷ࠷଎Խ • HV಺NWػೳ (ACL, NATͳͲ)ͷϦιʔε΍ܭࢉྔͳͲͷෛՙ 4
  5. 1. Introduction 5 •Ϋϥ΢υͱNW • LB, FW, VPN, QoS, DoS

    protection, isolation, NAT, … • ߴ଎ͳσʔλϓϨʔϯ & ߴ଎ͳίϯτϩʔϧϓϨʔϯ͕ඞཁ •Andromeda C-plane: • ֊૚C-plane, ϗετԾ૝SW, ֦ுՄೳͳGW • +10ສVMʹରԠɺதԝ஋184msͰߋ৽ॲཧ׬ྃ •Andromeda D-plane: • Idol- fl ow͸Hoverboard (GW)Ͱॲཧ • Active- fl ow͸ϗετଆͰॲཧʢ+ FWͳͲϛυϧϘοΫεॲཧʣ • Live migrationΛར༻ͯ͠ɺແఀࢭͰD-planeΞοϓάϨʔυ •աڈ5೥Ͱεϧʔϓοτ19ഒɺCPUޮ཰Խ16ഒɺlatency7ഒɺଳҬ50ഒ
  6. 2. Overview •ཁ݅ͱઃܭΰʔϧ • NW isolationͭͭ͠ɺ಺෦αʔϏε͸࢖͑Δ • ֦ுػೳ: ՝ۚ, DoSରࡦɺtrace,

    ύϑΥʔϚϯε؂ࢹɺFW • ো֐ൣғͷ࠷খԽͱՄ༻ੑͷ࠷େԽ • L.M.͸Մ༻ੑͱvelocityʹͱͬͯඞਢػೳ • C-plane scalability: ਺ेສVM෼ͷઃఆͱσϓϩΠ (RIB/ fl ow/VMͷadd/del) •ઃܭ֓ཁ • C-plane: Cloud Cluster MGMT -> Andromeda (NWػೳ) • D-plane: Isolationʢੑೳ෼཭ɺػೳ෼཭ʣɺϓϩάϥϛϯά༰қ • Fast Path (ϗετॲཧ) 3Mpps (300ns/packet) • ͕͔͔࣌ؒͬͯ΋ྑ͍΋ͷ͸Hoverboards (GW)΍ίϓϩηοαʔ (statefulͳػೳ)ʹసૹ • ϗετ্ͷϧʔϧʹϚον͠ͳ͍΋ͷ͸Hoverboardsʹసૹ͞ΕΔɻ͜ΕʹΑΓαʔόͷϝϞϦ࢖༻ྔc-planeͷCPU࢖༻཰͕ҰܻԼ͕ͬͨ 6
  7. 3. Control Plane (1/3) CM, FM, SL •Cluster Manager (CM):

    ֤छϓϩϏδϣχϯά༻ • ࠓճͷ࿩ͷείʔϓ֎ •Fabric Manager (FM): Ծ૝NWʹؔ࿈͢ΔAPI܈ • VM-CʢVMϗετɺhoverboardʣɺLB-C (Maglev) • ࠩ෼update, Controller sharding • gRPC -> OpenFlow Front End -> OpenFlow • VMੜ੒ -> event -> gRPC -> … ͷΑ͏ͳύλʔϯ΋͋Δ •Switch Layer: OvS or Hoverboards • ϥούʔ࣮૷ͰOvSઃఆ౤ೖʢσόοάɺϔϧενΣοΫɺI/F࡞੒ɾ࡟আʣ • OvS֦ுϞδϡʔϧͰNWػೳ࣮૷ • conn-track FW, ՝ۚ, sticky-LB, token validation (spoo fi ng๷ࢭ), WANଳҬ੍ݶ • OvSͷ upcall()ϋϯυϥΛར༻ 7
  8. 3. Control Plane (2/3) SDN෮श •Scalable network programming: ਺ඦສVMΛઃఆ͍ͨ͠ •

    ैདྷ: ΞυϨεू໿΍෺ཧू໿ͳͲͰ޻෉ • SDN: ෺ཧɾԾ૝NWͷ෼཭ -> C-planeͷεέʔϦϯά͕ίετ •ϞσϧҊ • Preprogrammed Model: ࣄલʹ͢΂ͯͷ fl ow ruleΛdeploy͓ͯ͘͠ɻҰ෦Ͱ΋มΘΔͱશnodeͰ࠶౓deploy͕ඞཁ • On Demand Model: ඞཁͳͱ͖ʹdeploy͢ΔɻҰൃ໨ͷlatency͕ܹ஗ɻC-plane͕ࢭ·ΔͱऴΘΓɻRequest fl oodʹ੬ऑ • Gateway Model: গ਺ͷgateway nodeʹͷΈdeploy͢Ε͹ྑ͍ɻ࢖༻ྔϐʔΫʹ߹ΘͤͯϓϩϏδϣχϯά͕ඞཁ 8 ܦ࿏৘ใ
  9. 3. Control Plane (3/3) Hoverboard/Live Migration, Reliability •Hoverboard Model: On

    Demand + Gateway • ͱʹ͔͘Hoverboard GWʹ౤͛Δ HGW͸͢΂ͯΛ஌͍ͬͯΔ • ௨৴ϑϩʔͷେ෦෼͸͜Εܦ༝ • ௨৴ྔʹԠͯ͡o ff l oad (HV/VMʹ௚઀౤͛Δ)͢ΔΑ͏ʹ͢Δ •Live migration • L2Ͱ͸ͳ͘L3ͰϚΠάϨʔγϣϯͤ͞Δ • چHVʹύέοτ͕དྷͨΒϔΞϐϯసૹͤ͞Δ • ϧʔςΟϯάςʔϒϧͷҰ੪ߋ৽͸ݱ࣮తʹෆՄೳͷͨΊ • ϝϯςɾΞοϓσʔτɾVMͷ഑ஔ࠷దԽ͕༰қ • SR-IOVར༻࣮૷ͩͱ৭ʑେมɺιϑτ΢ΣΞ࣮૷ͷํָ͕ •Reliability • CMͷ্Ґ: Globally Aware CPͱRegionally Aware CPͰো֐ൣғΛ෼ׂ • Fail static: CP͔Β֎Εͯ΋ྑ޷ঢ়ଶͳΒαʔϏεܧଓ 9 GACP RACP
  10. 4. VM Host Dataplane (1/2) •શൠతͳ࿩ • DP=ϢʔβۭؒͰಈ࡞ɻNIC/VMύέοτॲཧ • Fast

    Path: fl ow rule. ͔͋ͨ΋Ωϟογϡͱͯ͠ಈ࡞ • Flow rule͕ແ͍৔߹͸ vswitchd (+Controller) ܦ༝Ͱసૹ͞ΕΔ // ݹ͖ྑ͖(?)Packet Inํࣜ • Busy poll (DPDK), 3Mpps͙Β͍ • Coprocessor: WANύέοτ҉߸ԽͳͲͷL7 (஗ԆɺCPUॲཧ) •Principles and Practices • ιϑτ΢ΣΞDP͸HWੑೳɾػೳͷࠩҟΛٵऩͯ͠ಁաతʹ࢖͑ΔʢSRIOVϚΠάϨɺςʔϒϧ਺ͳͲʣ • Fast PathͷػೳΛ࠷খݶɿ։ൃίετͱCPUόδΣοτͷ࠷খԽɻཁ͕݅ݫ͘͠ͳ͍ύέοτ͸ CoprocessorͰॲཧ • ࣄલܭࢉʹΑΓɺ fl ow ruleͷkeyΛݮΒ͓ͯ͘͠ • ϋΠύϑΥʔϚϯεͷbest practice (࣮૷)͸DPDKͱಉͨ͡Ίলུ 10
  11. 4. VM Host Dataplane (2/2) •Fast Path Flow Table •

    ͳΔ΂͘3tuple, VIP௨৴͚ͩ5tuple • FW/LB/NATػೳΛఏڙ • vswitchd/control-planeʹࣄલܭࢉࡁΈͷ fl owΛೖΕΔ • ྫɿFWͷruleΛશͯࣄલʹೖΕΔඞཁͳ͠ •Coprocessor Path • ࣮૷͸ൺֱతࣗ༝ • VMؒ҉߸Խ (4Gbps)ɺDoS๷ࢭ (ACL), ෆਖ਼๷ࢭ, WAN shaping • VM͝ͱʹ͋Δɻfairness/isolationͷͨΊɻ 11
  12. 5. Evaluation (DPฤ) •Andromeda࣮૷มભ • Pre-Andromeda: VMMͰͷύέοτIO͸UDPιέοτɻsingle Q vNIC, HW

    o ff l oadͳ͠ (LRO/TSOͳ͠) • 1.0: VMM pipelineվળɻOVS, Multi Q, Egress HW o ff l oad • 1.5: Ingress HW o ff l oad, VMM/OVS table, schedվળͰ஗Ԇվળ • 2.0: ΧʔωϧόΠύε + Busy poll௥ՃɻVMϝϞϦϚοϐϯάʢvirtio?) • 2.1: VMMͷόΠύε (IOMMU/vhost-user vring?ʣͰCPUޮ཰Խɺ஗Ԇେ෯࡟ݮ • 2.2: Intel QuickData DMA Enginesར༻Ͱlarge packet copyޮ཰Խ 12 ࢀߟ: https://www.redhat.com/en/blog/journey-vhost-users-realm ಉҰΫϥελVMؒ௨৴
  13. 5. Evaluation (CPฤ) •ϗόʔϘʔυϞσϧར༻: 100k VM·Ͱ֦ுՄೳ (Fig.11 a) • VM͕NW઀ଓͰ͖Δ·Ͱͷ࣌ؒͷվળʢVMC

    -> FM, HV OF ruleੜ੒ʣ • 50%ile: 511ms -> 184ms, 99%ile: 3.7s -> 576ms • Andromeda 1.0: ϓϦϓϩάϥϛϯάϞσϧ: 2k VM͕ݶք •Ϧιʔεଌఆ ʢ40kVMͷ৔߹ͷϝϞϦ࢖༻ྔʣ • ϓϦϓϩάϥϛϯάϞσϧɿ74ඵͰ487MϑϩʔΛੜ੒ɺ10GB/shard • ϗόʔϘʔυϞσϧɿ2ඵͰ1.5Mϑϩʔੜ੒ɺ512MB/shard 13 VMؒͷ •ϗόʔϘʔυ΁ͷΦϑϩʔυ (Fig.11 b, c) • ͖͍͠஋Λ20Kbpsʹ͢ΔͱCPεέʔϥϏϦςΟ͕50ഒ޲্ • ΦϑϩʔυʢVM௚௨৴ʣ͢Ε͹͢Δ΄ͲϗόʔϘʔυτϥϑΟοΫ͸ݮΔ • 50kϑϩʔ(શϑϩʔͷ0.1%ఔ౓)ΛΦϑϩʔυ͢ΔͱɺH.B.࢖༻཰͸ 1%·ͰԼ͕Δɻ
  14. 6. Experiments •1.0ͷΧʔωϧσʔλύε • ίϯτϩʔϧϓϨʔϯ࣮૷͚ͩͰʢOpenFlow APIͰʣॳظͷNWػೳఏڙͰ͖͍ͯͨ • OpenFlow͚ͩͰ͸ConntrackͰ͖ͳ͍ -> OpenFlow

    (OVS)Ͱ͸ͳ͍ϝΧχζϜʢ֎෦ϞδϡʔϧʣͰରԠ • ࢀߟɿVFPͳͲ͸vswitchଆʹconntrackϓϦϛςΟϒΛೖΕͨ •2.0ͷϢʔβεϖʔεσʔλύε • vswitchͷແఀࢭupdate, HV಺Ͱ৽چೖΕସ͑ɻஅ࣌ؒ͸270ms@50%ile • Ϣʔβۭؒϓϩηεͳvswitchͷݎ࿚ੑʢVMͷΈӨڹΛड͚Δɻkernel stackͩͱϗετશମ͕ӨڹΛड͚Δʣ •ϓϦϓϩάϥϛϯάϞσϧͰΦϯσϚϯυupdate࣮૷͕ͨ͠ɺ଴ͪߦྻΛॿ௕ͯ͠ഁ୼ͨ͠ • OpenFlow ruleͦͷ··ͩͱଟ͗͢ΔͷͰίϯύΫτͳදݱ͕ඞཁ • Reverse Path ForwardingνΣοΫͷͨΊػೳ֦ு • VMCฒྻԽ͢Δ΋ݶք -> ϗόʔϘʔυͰղܾ •ϗόʔϘʔυ͖͍͠஋໰୊ɿόον௨৴ͳͲͷେ༰ྔ௨৴ݕग़ͱΦϑϩʔυͷߴ଎Խ 14
  15. 7. Related Work •NVP: VMware੎ • pre-programmed model, VM nݸʹରͯ͠ৗʹϑϩʔ͕O(2n)ඞཁ

    • Ծ૝NWͱ෺ཧNWͰίϯτϩʔϥͷύʔςΟγϣϯ෼཭ •VFP: MS/Azure੎ • Stateful + SR-IOVͰ࣮૷ • CPUॲཧͳػೳΛHVͰ࣮૷͠ʹ͍͘ • HWΦϑϩʔυʹґଘʢAndromeda͸SW࣮૷ͰύΠϓϥΠϯͰεέʔϧͤ͞Δʣ 15
  16. 8. Conclusion •Andromeda: GCPͷԾ૝NWελοΫ • D-plane • ιϑτ΢ΣΞ࣮૷ʢOSόΠύεʣͰ32.8Gb/sୡ੒ • ίϓϩηοαͰߴػೳʢCPUॲཧͳNWػೳʣΛ࣮૷

    • C-plane • Մ༻ੑɺεέʔϥϏϦςΟ • HoverboardsϞσϧ • ແఀࢭupgrade, VMϥΠϒϚΠάϨʔγϣϯͷॏཁੑ 16
  17. Key takeaways •GCPͷ100k VM/clusterΛ࣮ݱͤ͞ΔԾ૝NWελοΫ (Andromeda)ͷ޻෉ •D-plane • ιϑτ΢ΣΞ࣮૷Ͱͷߴ଎Խ: kernel bypass

    & PMD౳Ͱߴ଎ॲཧ •C-plane • εέʔϥϏϦςΟͷ޻෉: HoverboardsϞσϧར༻ʢGWར༻ʣͰ਺ສ VMن໛ͷNWΛ਺ඵͰσϓϩΠ 17