Slide 1

Slide 1 text

Research Paper Introduction #26 “Azure Accelerated Networking: SmartNICs in the Public Cloud” (#81 overall) @cafenero_777 2021/08/26

Slide 2

Slide 2 text

Agenda
• Target paper
• Overview and why I chose to read it
1. Introduction
2. Background
3. Design Goals and Rationale
4. SmartNIC Hardware Design
5. AccelNet System Design
6. Performance Results
7. Operationalization
8. Experiences
9. Related Work
10. Conclusion and Future Work

Slide 3

Slide 3 text

Target paper
• Azure Accelerated Networking: SmartNICs in the Public Cloud
• Daniel Firestone, Andrew Putnam, Sambhrama Mundkur, Derek Chiou, Alireza Dabagh, Mike Andrewartha, Hari Angepat, Vivek Bhanu, Adrian Caulfield, Eric Chung, Harish Kumar Chandrappa, Somesh Chaturmohta, Matt Humphrey, Jack Lavier, Norman Lam, Fengfen Liu, Kalin Ovtcharov, Jitu Padhye, Gautham Popuri, Shachar Raindel, Tejas Sapre, Mark Shaw, Gabriel Silva, Madhan Sivakumar, Nisheeth Srivastava, Anshuman Verma, Qasim Zuhair, Deepak Bansal, Doug Burger, Kushagra Vaid, David A. Maltz, Albert Greenberg
• Microsoft
• NSDI '18
• https://www.usenix.org/conference/nsdi18/presentation/firestone

Slide 4

Slide 4 text

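Overview and why I chose to read it
• Overview
  • Network functions in the cloud keep growing, which raises performance concerns
  • FPGA-based AccelNet (Azure Accelerated Networking)
  • Deployed since 2015 on more than 1 million servers; 15 us / 32 Gbps (VM-VM)
• Why I read it, and impressions
  • I was interested in SmartNICs
  • Its relationship to VFP and Catapult
  • I am working through older papers a few at a time
https://www.microsoft.com/en-us/research/publication/configurable-cloud-acceleration/
https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/firestone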

Slide 5

Slide 5 text

History of Azure SDN
• [Figure annotations: "previous talk", "this talk"]
• Source: https://www.usenix.org/sites/default/files/conference/protected-files/nsdi18_slides_firestone.pdf
• Presenter's note: is Catapult somewhere around here?

Slide 6

Slide 6 text

1. Introduction
• State of the public cloud
  • Bandwidth and latency matter
  • The required functions are implemented in SDN (L4LB/ACL/vRouting/metering/QoS on HV)
  • 1GbE -> 40GbE+
  • Use SR-IOV? -> That bypasses the SDN stack. Implement all of SDN in the NIC (HW)?!
• FPGA-based Azure SmartNIC
  • Host SDN stack: Azure Accelerated Networking (AccelNet)
  • Works in cooperation with VFP

Slide 7

Slide 7 text

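2. Background
• SDN was implemented on the host (VFP)
  • It is large-scale, complex, and frequently updated -> "infeasible" to implement in HW switches
• Weaknesses of doing network processing on the host
  • Processing happens in software on the HV (vSwitch) -> slower and higher latency (plus jitter) than a non-virtualized environment, and high CPU utilization
• What about SR-IOV?
  • The PF is for the host and a VF is passed directly to each VM, so performance stays at the physical level
  • Used as-is, however, it also bypasses VFP on the host
• GFT (Generic Flow Table) HW offload
  • VFP's mechanisms (UF/HT: unified flows and header transpositions) produce the match/action -> it is copied into GFT -> GFT HW (the NIC) processes the flow
  • In other words, SmartNIC HW offload in the style of OvS-TC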
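The GFT offload idea above (the first packet of a flow takes the software path, the resulting action is copied into a flow table, and later packets are handled from that table) can be sketched in a few lines. This is only an illustration under stated assumptions: `FlowOffloadSketch`, `example_policy`, the 5-tuple key, and the dict-based packet are hypothetical names made up for this sketch, not VFP or GFT APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical flow key: (src IP, dst IP, src port, dst port, protocol).
FiveTuple = Tuple[str, str, int, int, str]
Action = Callable[[dict], dict]


@dataclass
class FlowOffloadSketch:
    """Toy model of a flow table in front of a software policy engine."""

    slow_path: Callable[[FiveTuple], Action]  # stands in for the host-side engine
    table: Dict[FiveTuple, Action] = field(default_factory=dict)
    slow_path_hits: int = 0
    fast_path_hits: int = 0

    def process(self, key: FiveTuple, packet: dict) -> dict:
        action = self.table.get(key)
        if action is None:
            # Exception path: no entry yet, so ask software for the action
            # and copy it into the table for subsequent packets.
            self.slow_path_hits += 1
            action = self.slow_path(key)
            self.table[key] = action
        else:
            # Fast path: the action is applied straight from the table.
            self.fast_path_hits += 1
        return action(packet)


def example_policy(key: FiveTuple) -> Action:
    """Hypothetical policy: rewrite the destination address (NAT/LB-like)."""

    def rewrite(packet: dict) -> dict:
        return {**packet, "dst": "10.0.0.5"}

    return rewrite


nic = FlowOffloadSketch(slow_path=example_policy)
flow = ("192.0.2.1", "198.51.100.7", 40000, 443, "tcp")
first = nic.process(flow, {"dst": "198.51.100.7", "payload": b"a"})
second = nic.process(flow, {"dst": "198.51.100.7", "payload": b"b"})
```

In this toy run only the first packet reaches `example_policy`; the second is served from the table, which is the property the offload relies on.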

Slide 8

Slide 8 text

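3. Design Goals and Rationale
• Don't use the CPU: this is an IaaS business, so CPU use translates directly into cost (1 core: $900/year, $4500 over 3-5 years)
• Keep VFP's programmability: not everything needs to be offloaded (e.g., policy changes while a connection is active)
• Low latency and high bandwidth by using SR-IOV: achieved with GFT
• Support new SDN features: be wary of design lock-in. New features should also be deployable to all existing environments
• High performance on a single connection: impossible on the CPU. Spreading the load requires application changes
• A path to 100GbE+: keep supporting growth in bandwidth and VM count
• Preserve serviceability (flow state): handle live migration and maintenance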
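The per-core figures in the first goal are simple multiplication, and a short sketch makes the relationship explicit. The function names and the `cores_used` parameter are hypothetical, introduced only for illustration; the $900/year and 3-5 year inputs come from the slide.

```python
def lifetime_core_revenue(revenue_per_year: float, lifetime_years: float) -> float:
    """Revenue one sellable core could earn over a server's lifetime."""
    return revenue_per_year * lifetime_years


def host_networking_cost(cores_used: float,
                         revenue_per_year: float = 900.0,
                         lifetime_years: float = 5.0) -> float:
    """Opportunity cost of cores spent on host networking instead of being sold."""
    return cores_used * lifetime_core_revenue(revenue_per_year, lifetime_years)


upper = lifetime_core_revenue(900.0, 5.0)   # 5-year lifetime
lower = lifetime_core_revenue(900.0, 3.0)   # 3-year lifetime
two_cores = host_networking_cost(2.0)       # hypothetical: two cores burned on the vSwitch
```

The $4500 figure on the slide corresponds to the 5-year end of the range; with a 3-year lifetime the same arithmetic gives $2700 per core.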

Slide 9

Slide 9 text

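4. SmartNIC Hardware Design (1/3) Hardware Option
• They worked with network ASIC vendors on a design proposal, but no such NIC ever appeared
  • Apparently the design, in which a first packet that matches nothing is forwarded to the host, was not accepted
• > ASIC-based
  • Time lag (1-2 years until the ASIC is ready; about 5 years until servers are written off) and dependence on vendor-supplied firmware
• > Multicore SoC NIC
  • Code can be written as-is. Fine at 10G, but at 40G the core scheduler introduces latency. Core counts balloon at 100G+, raising power and cost concerns
• > DPDK
  • Greatly lowers the per-packet processing cost, but spends CPU cores. Slower than a multicore SoC
• > FPGA
  • Looks able to keep programmability while exploiting hardware performance (bus width, storage, pipelined processing). Catapult also provides a track record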

Slide 10

Slide 10 text

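4. SmartNIC Hardware Design (2/3) Evaluating FPGA as SmartNICs
SDN team: "We have never used FPGAs. Can we really use them?"
• Is the logic size 10x that of an ASIC? -> In practice it is 2-3x
  • FPGAs: more SRAM, I/O, and hardened custom logic; ASICs: more programmable logic
• Are FPGAs expensive? -> At Azure's scale they pay for themselves
• Are FPGAs hard to program?
  • They deliver high performance only when the pipeline is designed efficiently
  • The Catapult team provided support. The AccelNet team was 5 people. HW/SW co-design model (e.g., agile development)
• Do FPGAs scale?
  • Already solved by the Catapult team (e.g., the shell mechanism)
• What about portability?
  • The code is written in SystemVerilog. Portability is achievable if you write with it in mind. In practice they switched from Xilinx to Altera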

Slide 11

Slide 11 text

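4. SmartNIC Hardware Design (3/3) SmartNIC System Architecture
• Fully implementing SR-IOV and RDMA (on the FPGA) is difficult
  • Start by implementing only the SDN offload functions
  • Piggyback on the bump-in-the-wire configuration
• (a) 2015-, 40G (Altera), (b) OCP 50G (Intel)
• [Figure labels: FPGA w/ NIC; offload processing; SR-IOV processing; VFP processing]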

Slide 12

Slide 12 text

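5. AccelNet System Design
• Software design
  • Exception packets (roughly, first packets) are tagged and sent to the host/VFP
  • When the FPGA detects TCP connection setup or teardown (SYN/RST/FIN), it hands the packet to VFP so that VFP can track the TCP state
• FPGA pipeline design
  • Match/action, with reconfigurable actions. Uses about 1/3 of the logic area of a Stratix V D5
• Flow tracking, reconciliation
  • ACK numbers and counters are synchronized to VFP over PCIe
  • Policy updates and generation management are also synchronized with VFP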
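The exception-path rules on this slide can be sketched as a small classifier: a packet goes to software when its flow has no entry, when its entry predates the latest policy update, or when it carries SYN/FIN/RST. This is a simplified model under assumptions: `PipelineSketch`, `FlowEntry`, and the single global generation counter are hypothetical, and treating flagged packets purely as exceptions is a simplification of whatever the real pipeline does. Only the TCP flag bit values are standard.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Standard TCP header flag bits.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10

FlowKey = Tuple[str, str, int, int]  # hypothetical: src IP, dst IP, src port, dst port


@dataclass
class FlowEntry:
    action: str       # placeholder for the offloaded action
    generation: int   # policy generation the entry was created under


class PipelineSketch:
    """Toy classifier deciding between the offload path and the exception path."""

    def __init__(self) -> None:
        self.generation = 0
        self.flows: Dict[FlowKey, FlowEntry] = {}

    def install(self, key: FlowKey, action: str) -> None:
        # Called by the software side after it has evaluated policy for a flow.
        self.flows[key] = FlowEntry(action, self.generation)

    def update_policy(self) -> None:
        # Assumption: bumping a generation counter invalidates older entries lazily.
        self.generation += 1

    def classify(self, key: FlowKey, tcp_flags: int) -> str:
        entry = self.flows.get(key)
        if entry is None or entry.generation != self.generation:
            self.flows.pop(key, None)  # drop a stale entry, if any
            return "exception"
        if tcp_flags & (SYN | FIN | RST):
            return "exception"  # let software follow the TCP state machine
        return "offload"


pipe = PipelineSketch()
conn = ("10.0.0.1", "10.0.0.2", 1234, 80)
r_first = pipe.classify(conn, SYN)        # no entry yet
pipe.install(conn, "encap")
r_data = pipe.classify(conn, ACK)         # established flow, plain ACK
r_fin = pipe.classify(conn, FIN | ACK)    # teardown is visible to software
pipe.update_policy()
r_stale = pipe.classify(conn, ACK)        # entry predates the policy update
```

A lazy generation check like this avoids walking the whole table on every policy change; whether the real design invalidates entries this way is not stated on the slide.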

Slide 13

Slide 13 text

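6. Performance Results
• Test setup: 40Gbps Gen1 SmartNIC + Xeon E5-2673v4, within the same region over a Clos network
• [Figure annotations: high bandwidth; low latency; achieved with a single connection; 2x application performance]
• Comparison with other cloud providers (as of November 2017)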

Slide 14

Slide 14 text

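7. Operationalization
• For maintenance and live migration, state must be preserved and restored
  • The VF should not be exposed directly (to the application)
  • netvsc (Virtual Service Consumer) switches the underlying "leg"
  • Transparent bonding (committed to the kernel)
• What if the device is grabbed directly by a DPDK PMD?
  • A failsafe PMD was implemented (committed to DPDK)
• RDMA
  • (In general, when something goes wrong) it falls back to TCP. So far this has not been a problem
• The above works well during update maintenance of the AccelNet stack
  • VMs for which even momentary jitter is a problem are notified through metadata, so they can migrate in advance or be shifted away at the load balancer
• Monitoring
  • Metrics, plus packet capture/trace/sampling on every interface of the FPGA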
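The transparent-bonding behaviour described above (the application keeps using one interface while the path underneath switches between the VF and the software path) can be modelled roughly as follows. `BondedInterfaceSketch` and its method names are hypothetical and do not mirror the actual netvsc driver or the DPDK failsafe PMD; the sketch only shows the selection logic.

```python
from typing import List, Tuple


class BondedInterfaceSketch:
    """Toy model of one guest-visible interface backed by two data paths."""

    def __init__(self) -> None:
        self.vf_present = False                   # VF may be hot-removed at any time
        self.sent: List[Tuple[str, bytes]] = []   # (path used, frame) for inspection

    def vf_added(self) -> None:
        # The VF came (back), e.g. after maintenance or migration has finished.
        self.vf_present = True

    def vf_removed(self) -> None:
        # The VF was revoked, e.g. before servicing the host or migrating the VM.
        self.vf_present = False

    def send(self, frame: bytes) -> str:
        # The caller never chooses a path; the bond picks the fastest one available.
        path = "vf" if self.vf_present else "synthetic"
        self.sent.append((path, frame))
        return path


nic = BondedInterfaceSketch()
p1 = nic.send(b"one")     # no VF yet: software path
nic.vf_added()
p2 = nic.send(b"two")     # accelerated path
nic.vf_removed()
p3 = nic.send(b"three")   # back on the software path, connection stays up
```

Because the application only ever sees the bonded interface, revoking the VF costs performance for a while rather than breaking connections, which is the point of the design on this slide.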

Slide 15

Slide 15 text

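8. Experiences (1/3) Were the goals achieved? -> Yes
• Don't use the CPU: this is an IaaS business, so CPU use translates directly into cost (1 core: $900/year, $4500 over 3-5 years)
  • No CPU is used on the data path. CPU utilization for exception handling is under 1%
• Keep VFP's programmability: not everything needs to be offloaded (e.g., policy changes while a connection is active)
  • VFP is used unchanged
• Low latency and high bandwidth with SR-IOV: achieved with GFT
  • The latency added by inserting the FPGA is <1 us, and line rate is achieved
• Support new SDN features: be wary of design lock-in. New features should also be deployable to all existing environments
  • New features on the FPGA (actions, etc.) are added and updated, and run on every server
• High performance on a single connection: impossible on the CPU. Spreading the load requires application changes
  • Achieved
• A path to 100GbE+: keep supporting growth in bandwidth and VM count
  • Looks achievable
• Preserve serviceability (flow state): handle live migration and maintenance
  • It has been running in production for years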

Slide 16

Slide 16 text

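8. Experiences (2/3) On FPGAs in general
• Are FPGAs data center ready?
  • They cannot be called the best solution for every cloud environment
  • Until the ecosystem matures, they do not suit anyone other than large cloud vendors
  • It helps to have support from a HW team (Catapult), but...
• Continuous implementation changes that were possible with FPGAs (and probably would not have been with ASICs)
  • Custom encapsulation/overlay, NAT46, performance improvements (hashing/table insertion)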

Slide 17

Slide 17 text

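8. Experiences (3/3) Lessons learned (excerpt of the points that resonated with the presenter)
• Services that integrate HW/SW/VM (e.g., migration) and monitoring cannot be bolted on afterwards
• Put HW and SW developers in the same team
• Use software techniques for FPGA development
  • CI/CD is run with VFP and the FPGA handled as separate items. Treat HW logic like software wherever possible
  • Instead of strict RTL verification, examine metrics and run scenarios in the production environment
• Better performance = better reliability. It removes transient degradation caused by host-side resources
• Iterative improvement of HW/SW matters. Collect data from real workloads and keep improving
  • Hashing/caching was changed many times after release
• The part that fails most often is the DRAM on the FPGA. The failure rate itself is no different from other parts
• Abstract things at the upper layer (VFP), and think about offload separately
• Because the CPU is not used, the network performance degradation caused by Meltdown/Spectre mitigations is greatly reduced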

Slide 18

Slide 18 text

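9. Related Work
• SmartNICs start to appear on the market from 2015 onward
  • They lack flexibility (e.g., extending actions) or rely on CPU cores (Netronome Agilio), so 100G+ may be difficult?
• Doing LB in a switch ASIC -> the host cannot follow the TCP state, and not all of the policy fits in the ASIC
• P4 engines: an approach similar to GFT. Also extensible. Likely to be implemented in hardware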

Slide 19

Slide 19 text

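10. Conclusion and Future Work
• FPGA-based Azure SmartNIC
  • Works in cooperation with VFP to accelerate networking (offload)
  • Insights from design and operations
• With a programmable NIC in every HV, entirely new features might become possible?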

Slide 20

Slide 20 text

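Three-line summary
• Azure implemented an FPGA-based SmartNIC that works in cooperation with the host-side SDN software (Azure VFP)
  • Only the first packet (exception packet) is processed in software on the host (to determine the action); that action is copied to the FPGA, and subsequent packets are processed in HW (FPGA)
• In a 40Gbps environment, a single connection (VM-to-VM) reaches 32 Gbps and 15 us
• At Azure's scale, developing new features on FPGAs (e.g., new encapsulation formats) and scaling them out is feasible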

Slide 21

Slide 21 text

EoP