Slide 1

Slide 1 text

Research Paper Introduction #41 “Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization” ௨ࢉ#108 @cafenero_777 2022/11/10 1

Slide 2

Slide 2 text

Agenda •ର৅࿦จ •֓ཁͱಡ΋͏ͱͨ͠ཧ༝ 1. Introduction 2. Overview 3. Control Plane 4. VM Host Dataplane 5. Evaluation 6. Experience 7. Related Work 8. Conclusions 2

Slide 3

Slide 3 text

ର৅࿦จ •Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization • Michael Dalton, et.al. ૯੎24໊ at Google inc. • NSDI ‘18 • https://www.usenix.org/conference/nsdi18/presentation/dalton • Tech blog • https://cloud.google.com/blog/products/networking/google-cloud-networking-in- depth-how-andromeda-2-2-enables-high-throughput-vms 3

Slide 4

Slide 4 text

֓ཁͱಡ΋͏ͱͨ͠ཧ༝ •֓ཁ • GCPͷData Plane stackͷϑϩʔॲཧ • HoverboadϞσϧ (GWར༻)Ͱ਺ສVMن໛ͷNWΛ਺ඵͰσϓϩΠ • HVͰ͸kernel bypass & PMD౳Ͱߴ଎ॲཧ •ಡ΋͏ͱͨ͠ཧ༝ • վΊͯؾʹͳͬͨͷͰɻ • VMϦʔνఏڙͷ࠷଎Խ • HV಺NWػೳ (ACL, NATͳͲ)ͷϦιʔε΍ܭࢉྔͳͲͷෛՙ 4

Slide 5

Slide 5 text

1. Introduction 5 •Ϋϥ΢υͱNW • LB, FW, VPN, QoS, DoS protection, isolation, NAT, … • ߴ଎ͳσʔλϓϨʔϯ & ߴ଎ͳίϯτϩʔϧϓϨʔϯ͕ඞཁ •Andromeda C-plane: • ֊૚C-plane, ϗετԾ૝SW, ֦ுՄೳͳGW • +10ສVMʹରԠɺதԝ஋184msͰߋ৽ॲཧ׬ྃ •Andromeda D-plane: • Idol- fl ow͸Hoverboard (GW)Ͱॲཧ • Active- fl ow͸ϗετଆͰॲཧʢ+ FWͳͲϛυϧϘοΫεॲཧʣ • Live migrationΛར༻ͯ͠ɺແఀࢭͰD-planeΞοϓάϨʔυ •աڈ5೥Ͱεϧʔϓοτ19ഒɺCPUޮ཰Խ16ഒɺlatency7ഒɺଳҬ50ഒ

Slide 6

Slide 6 text

2. Overview •ཁ݅ͱઃܭΰʔϧ • NW isolationͭͭ͠ɺ಺෦αʔϏε͸࢖͑Δ • ֦ுػೳ: ՝ۚ, DoSରࡦɺtrace, ύϑΥʔϚϯε؂ࢹɺFW • ো֐ൣғͷ࠷খԽͱՄ༻ੑͷ࠷େԽ • L.M.͸Մ༻ੑͱvelocityʹͱͬͯඞਢػೳ • C-plane scalability: ਺ेສVM෼ͷઃఆͱσϓϩΠ (RIB/ fl ow/VMͷadd/del) •ઃܭ֓ཁ • C-plane: Cloud Cluster MGMT -> Andromeda (NWػೳ) • D-plane: Isolationʢੑೳ෼཭ɺػೳ෼཭ʣɺϓϩάϥϛϯά༰қ • Fast Path (ϗετॲཧ) 3Mpps (300ns/packet) • ͕͔͔࣌ؒͬͯ΋ྑ͍΋ͷ͸Hoverboards (GW)΍ίϓϩηοαʔ (statefulͳػೳ)ʹసૹ • ϗετ্ͷϧʔϧʹϚον͠ͳ͍΋ͷ͸Hoverboardsʹసૹ͞ΕΔɻ͜ΕʹΑΓαʔόͷϝϞϦ࢖༻ྔc-planeͷCPU࢖༻཰͕ҰܻԼ͕ͬͨ 6

Slide 7

Slide 7 text

3. Control Plane (1/3) CM, FM, SL •Cluster Manager (CM): ֤छϓϩϏδϣχϯά༻ • ࠓճͷ࿩ͷείʔϓ֎ •Fabric Manager (FM): Ծ૝NWʹؔ࿈͢ΔAPI܈ • VM-CʢVMϗετɺhoverboardʣɺLB-C (Maglev) • ࠩ෼update, Controller sharding • gRPC -> OpenFlow Front End -> OpenFlow • VMੜ੒ -> event -> gRPC -> … ͷΑ͏ͳύλʔϯ΋͋Δ •Switch Layer: OvS or Hoverboards • ϥούʔ࣮૷ͰOvSઃఆ౤ೖʢσόοάɺϔϧενΣοΫɺI/F࡞੒ɾ࡟আʣ • OvS֦ுϞδϡʔϧͰNWػೳ࣮૷ • conn-track FW, ՝ۚ, sticky-LB, token validation (spoo fi ng๷ࢭ), WANଳҬ੍ݶ • OvSͷ upcall()ϋϯυϥΛར༻ 7

Slide 8

Slide 8 text

3. Control Plane (2/3) SDN෮श •Scalable network programming: ਺ඦສVMΛઃఆ͍ͨ͠ • ैདྷ: ΞυϨεू໿΍෺ཧू໿ͳͲͰ޻෉ • SDN: ෺ཧɾԾ૝NWͷ෼཭ -> C-planeͷεέʔϦϯά͕ίετ •ϞσϧҊ • Preprogrammed Model: ࣄલʹ͢΂ͯͷ fl ow ruleΛdeploy͓ͯ͘͠ɻҰ෦Ͱ΋มΘΔͱશnodeͰ࠶౓deploy͕ඞཁ • On Demand Model: ඞཁͳͱ͖ʹdeploy͢ΔɻҰൃ໨ͷlatency͕ܹ஗ɻC-plane͕ࢭ·ΔͱऴΘΓɻRequest fl oodʹ੬ऑ • Gateway Model: গ਺ͷgateway nodeʹͷΈdeploy͢Ε͹ྑ͍ɻ࢖༻ྔϐʔΫʹ߹ΘͤͯϓϩϏδϣχϯά͕ඞཁ 8 ܦ࿏৘ใ

Slide 9

Slide 9 text

3. Control Plane (3/3) Hoverboard/Live Migration, Reliability •Hoverboard Model: On Demand + Gateway • ͱʹ͔͘Hoverboard GWʹ౤͛Δ HGW͸͢΂ͯΛ஌͍ͬͯΔ • ௨৴ϑϩʔͷେ෦෼͸͜Εܦ༝ • ௨৴ྔʹԠͯ͡o ff l oad (HV/VMʹ௚઀౤͛Δ)͢ΔΑ͏ʹ͢Δ •Live migration • L2Ͱ͸ͳ͘L3ͰϚΠάϨʔγϣϯͤ͞Δ • چHVʹύέοτ͕དྷͨΒϔΞϐϯసૹͤ͞Δ • ϧʔςΟϯάςʔϒϧͷҰ੪ߋ৽͸ݱ࣮తʹෆՄೳͷͨΊ • ϝϯςɾΞοϓσʔτɾVMͷ഑ஔ࠷దԽ͕༰қ • SR-IOVར༻࣮૷ͩͱ৭ʑେมɺιϑτ΢ΣΞ࣮૷ͷํָ͕ •Reliability • CMͷ্Ґ: Globally Aware CPͱRegionally Aware CPͰো֐ൣғΛ෼ׂ • Fail static: CP͔Β֎Εͯ΋ྑ޷ঢ়ଶͳΒαʔϏεܧଓ 9 GACP RACP

Slide 10

Slide 10 text

4. VM Host Dataplane (1/2) •શൠతͳ࿩ • DP=ϢʔβۭؒͰಈ࡞ɻNIC/VMύέοτॲཧ • Fast Path: fl ow rule. ͔͋ͨ΋Ωϟογϡͱͯ͠ಈ࡞ • Flow rule͕ແ͍৔߹͸ vswitchd (+Controller) ܦ༝Ͱసૹ͞ΕΔ // ݹ͖ྑ͖(?)Packet Inํࣜ • Busy poll (DPDK), 3Mpps͙Β͍ • Coprocessor: WANύέοτ҉߸ԽͳͲͷL7 (஗ԆɺCPUॲཧ) •Principles and Practices • ιϑτ΢ΣΞDP͸HWੑೳɾػೳͷࠩҟΛٵऩͯ͠ಁաతʹ࢖͑ΔʢSRIOVϚΠάϨɺςʔϒϧ਺ͳͲʣ • Fast PathͷػೳΛ࠷খݶɿ։ൃίετͱCPUόδΣοτͷ࠷খԽɻཁ͕݅ݫ͘͠ͳ͍ύέοτ͸ CoprocessorͰॲཧ • ࣄલܭࢉʹΑΓɺ fl ow ruleͷkeyΛݮΒ͓ͯ͘͠ • ϋΠύϑΥʔϚϯεͷbest practice (࣮૷)͸DPDKͱಉͨ͡Ίলུ 10

Slide 11

Slide 11 text

4. VM Host Dataplane (2/2) •Fast Path Flow Table • ͳΔ΂͘3tuple, VIP௨৴͚ͩ5tuple • FW/LB/NATػೳΛఏڙ • vswitchd/control-planeʹࣄલܭࢉࡁΈͷ fl owΛೖΕΔ • ྫɿFWͷruleΛશͯࣄલʹೖΕΔඞཁͳ͠ •Coprocessor Path • ࣮૷͸ൺֱతࣗ༝ • VMؒ҉߸Խ (4Gbps)ɺDoS๷ࢭ (ACL), ෆਖ਼๷ࢭ, WAN shaping • VM͝ͱʹ͋Δɻfairness/isolationͷͨΊɻ 11

Slide 12

Slide 12 text

5. Evaluation (DPฤ) •Andromeda࣮૷มભ • Pre-Andromeda: VMMͰͷύέοτIO͸UDPιέοτɻsingle Q vNIC, HW o ff l oadͳ͠ (LRO/TSOͳ͠) • 1.0: VMM pipelineվળɻOVS, Multi Q, Egress HW o ff l oad • 1.5: Ingress HW o ff l oad, VMM/OVS table, schedվળͰ஗Ԇվળ • 2.0: ΧʔωϧόΠύε + Busy poll௥ՃɻVMϝϞϦϚοϐϯάʢvirtio?) • 2.1: VMMͷόΠύε (IOMMU/vhost-user vring?ʣͰCPUޮ཰Խɺ஗Ԇେ෯࡟ݮ • 2.2: Intel QuickData DMA Enginesར༻Ͱlarge packet copyޮ཰Խ 12 ࢀߟ: https://www.redhat.com/en/blog/journey-vhost-users-realm ಉҰΫϥελVMؒ௨৴

Slide 13

Slide 13 text

5. Evaluation (CPฤ) •ϗόʔϘʔυϞσϧར༻: 100k VM·Ͱ֦ுՄೳ (Fig.11 a) • VM͕NW઀ଓͰ͖Δ·Ͱͷ࣌ؒͷվળʢVMC -> FM, HV OF ruleੜ੒ʣ • 50%ile: 511ms -> 184ms, 99%ile: 3.7s -> 576ms • Andromeda 1.0: ϓϦϓϩάϥϛϯάϞσϧ: 2k VM͕ݶք •Ϧιʔεଌఆ ʢ40kVMͷ৔߹ͷϝϞϦ࢖༻ྔʣ • ϓϦϓϩάϥϛϯάϞσϧɿ74ඵͰ487MϑϩʔΛੜ੒ɺ10GB/shard • ϗόʔϘʔυϞσϧɿ2ඵͰ1.5Mϑϩʔੜ੒ɺ512MB/shard 13 VMؒͷ •ϗόʔϘʔυ΁ͷΦϑϩʔυ (Fig.11 b, c) • ͖͍͠஋Λ20Kbpsʹ͢ΔͱCPεέʔϥϏϦςΟ͕50ഒ޲্ • ΦϑϩʔυʢVM௚௨৴ʣ͢Ε͹͢Δ΄ͲϗόʔϘʔυτϥϑΟοΫ͸ݮΔ • 50kϑϩʔ(શϑϩʔͷ0.1%ఔ౓)ΛΦϑϩʔυ͢ΔͱɺH.B.࢖༻཰͸ 1%·ͰԼ͕Δɻ

Slide 14

Slide 14 text

6. Experiments •1.0ͷΧʔωϧσʔλύε • ίϯτϩʔϧϓϨʔϯ࣮૷͚ͩͰʢOpenFlow APIͰʣॳظͷNWػೳఏڙͰ͖͍ͯͨ • OpenFlow͚ͩͰ͸ConntrackͰ͖ͳ͍ -> OpenFlow (OVS)Ͱ͸ͳ͍ϝΧχζϜʢ֎෦ϞδϡʔϧʣͰରԠ • ࢀߟɿVFPͳͲ͸vswitchଆʹconntrackϓϦϛςΟϒΛೖΕͨ •2.0ͷϢʔβεϖʔεσʔλύε • vswitchͷແఀࢭupdate, HV಺Ͱ৽چೖΕସ͑ɻஅ࣌ؒ͸270ms@50%ile • Ϣʔβۭؒϓϩηεͳvswitchͷݎ࿚ੑʢVMͷΈӨڹΛड͚Δɻkernel stackͩͱϗετશମ͕ӨڹΛड͚Δʣ •ϓϦϓϩάϥϛϯάϞσϧͰΦϯσϚϯυupdate࣮૷͕ͨ͠ɺ଴ͪߦྻΛॿ௕ͯ͠ഁ୼ͨ͠ • OpenFlow ruleͦͷ··ͩͱଟ͗͢ΔͷͰίϯύΫτͳදݱ͕ඞཁ • Reverse Path ForwardingνΣοΫͷͨΊػೳ֦ு • VMCฒྻԽ͢Δ΋ݶք -> ϗόʔϘʔυͰղܾ •ϗόʔϘʔυ͖͍͠஋໰୊ɿόον௨৴ͳͲͷେ༰ྔ௨৴ݕग़ͱΦϑϩʔυͷߴ଎Խ 14

Slide 15

Slide 15 text

7. Related Work •NVP: VMware੎ • pre-programmed model, VM nݸʹରͯ͠ৗʹϑϩʔ͕O(2n)ඞཁ • Ծ૝NWͱ෺ཧNWͰίϯτϩʔϥͷύʔςΟγϣϯ෼཭ •VFP: MS/Azure੎ • Stateful + SR-IOVͰ࣮૷ • CPUॲཧͳػೳΛHVͰ࣮૷͠ʹ͍͘ • HWΦϑϩʔυʹґଘʢAndromeda͸SW࣮૷ͰύΠϓϥΠϯͰεέʔϧͤ͞Δʣ 15

Slide 16

Slide 16 text

8. Conclusion •Andromeda: GCPͷԾ૝NWελοΫ • D-plane • ιϑτ΢ΣΞ࣮૷ʢOSόΠύεʣͰ32.8Gb/sୡ੒ • ίϓϩηοαͰߴػೳʢCPUॲཧͳNWػೳʣΛ࣮૷ • C-plane • Մ༻ੑɺεέʔϥϏϦςΟ • HoverboardsϞσϧ • ແఀࢭupgrade, VMϥΠϒϚΠάϨʔγϣϯͷॏཁੑ 16

Slide 17

Slide 17 text

Key takeaways •GCPͷ100k VM/clusterΛ࣮ݱͤ͞ΔԾ૝NWελοΫ (Andromeda)ͷ޻෉ •D-plane • ιϑτ΢ΣΞ࣮૷Ͱͷߴ଎Խ: kernel bypass & PMD౳Ͱߴ଎ॲཧ •C-plane • εέʔϥϏϦςΟͷ޻෉: HoverboardsϞσϧར༻ʢGWར༻ʣͰ਺ສ VMن໛ͷNWΛ਺ඵͰσϓϩΠ 17

Slide 18

Slide 18 text

EoP 18