Research Paper Introduction #26 “Azure Accelerated Networking: SmartNICs in the Public Cloud”

Azure Accelerated Networking: SmartNICs in the Public Cloud
NSDI ‘18
https://www.usenix.org/conference/nsdi18/presentation/firestone

cafenero_777

January 02, 2023

Transcript

  1. Research Paper Introduction #26 “Azure Accelerated Networking: SmartNICs in the Public Cloud”

    Cumulative #81, @cafenero_777, 2021/08/26
  2. Agenda
    • Target paper
    • Overview and why I chose to read it
    1. Introduction
    2. Background
    3. Design Goals and Rationale
    4. SmartNIC Hardware Design
    5. AccelNet System Design
    6. Performance Results
    7. Operationalization
    8. Experiences
    9. Related Work
    10. Conclusion and Future Work
  3. Target paper
    • Azure Accelerated Networking: SmartNICs in the Public Cloud
    • Authors: Daniel Firestone, Andrew Putnam, Sambhrama Mundkur, Derek Chiou, Alireza Dabagh, Mike Andrewartha, Hari Angepat, Vivek Bhanu, Adrian Caulfield, Eric Chung, Harish Kumar Chandrappa, Somesh Chaturmohta, Matt Humphrey, Jack Lavier, Norman Lam, Fengfen Liu, Kalin Ovtcharov, Jitu Padhye, Gautham Popuri, Shachar Raindel, Tejas Sapre, Mark Shaw, Gabriel Silva, Madhan Sivakumar, Nisheeth Srivastava, Anshuman Verma, Qasim Zuhair, Deepak Bansal, Doug Burger, Kushagra Vaid, David A. Maltz, Albert Greenberg
    • Microsoft
    • NSDI ‘18
    • https://www.usenix.org/conference/nsdi18/presentation/firestone
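  4. Overview and why I chose to read it
    • Overview
      • Cloud networking features keep growing, raising performance concerns
      • FPGA-based AccelNet (Azure Accelerated Networking)
      • Deployed since 2015 on more than 1 million machines; 15 us / 32 Gbps (VM to VM)
    • Why I read it, and impressions
      • I was interested in SmartNICs
      • Its relationship to VFP and Catapult
      • I am working through older papers a little at a time
    • Related papers:
      • https://www.microsoft.com/en-us/research/publication/configurable-cloud-acceleration/
      • https://www.usenix.org/conference/nsdi17/technical-sessions/presentation/firestone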
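  5. 1. Introduction
    • The public cloud situation
      • Bandwidth and latency matter
      • Required features are implemented in SDN on the hypervisor (L4 LB, ACLs, virtual routing, metering, QoS)
      • Link speeds have gone from 1GbE to 40GbE and beyond
    • Use SR-IOV? That bypasses the SDN stack. So implement all of the SDN in the NIC (hardware)?!
    • FPGA-based Azure SmartNIC
      • Host SDN stack: Azure Accelerated Networking (AccelNet)
      • Works in concert with VFP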
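  6. 2. Background
    • SDN was implemented on the host (VFP)
      • It is large-scale, complex, and frequently updated, so doing it in hardware switches is "infeasible"
    • Weaknesses of host-side network processing
      • Processing in hypervisor software (vSwitch) is slower, has higher latency (plus jitter), and uses more CPU than a non-virtualized environment
    • What about SR-IOV?
      • The PF is for the host and a VF is passed directly to each VM, so performance stays at bare-metal level
      • Used as-is, however, it also bypasses VFP on the host
    • GFT (Generic Flow Table) hardware offload
      • VFP's mechanisms (UF/HT) decide the match/action -> it is copied into GFT -> GFT hardware (the NIC) processes the flow
      • In other words, SmartNIC hardware offload in the style of OvS-TC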
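
The GFT offload described on this slide can be illustrated with a minimal sketch: an exact-match flow table that sends only the first packet of a flow to a software slow path and serves later packets from the cached action. This is an illustration under stated assumptions, not the AccelNet implementation. `FlowCache`, `example_policy`, and the 5-tuple key layout are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical flow key: (src ip, dst ip, src port, dst port, protocol)
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class FlowCache:
    """Toy model of an exact-match flow table in front of a software slow path."""
    slow_path: Callable[[FlowKey], str]  # stands in for policy evaluation in software
    table: Dict[FlowKey, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def process(self, key: FlowKey) -> str:
        action = self.table.get(key)
        if action is None:
            # Exception path: no entry yet, so ask software and cache the answer.
            self.misses += 1
            action = self.slow_path(key)
            self.table[key] = action
        else:
            # Fast path: the cached action is applied without touching software.
            self.hits += 1
        return action


def example_policy(key: FlowKey) -> str:
    # Hypothetical policy: allow TCP to port 443, drop everything else.
    _, _, _, dst_port, proto = key
    return "allow" if (proto == "tcp" and dst_port == 443) else "drop"


cache = FlowCache(slow_path=example_policy)
flow = ("10.0.0.1", "10.0.0.2", 50000, 443, "tcp")
results = [cache.process(flow) for _ in range(3)]
```

In this sketch only the first of the three packets reaches `example_policy`; the other two are answered from the table, which is the property that lets the slow path stay in software.
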
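  7. 3. Design Goals and Rationale
    • Do not use host CPU: this is an IaaS business, so CPU use translates directly into cost (1 core = $900/year, $4500 over a 3-5 year lifetime)
    • Keep VFP's programmability: not everything has to be offloaded (e.g., policy changes during a live connection)
    • Use SR-IOV for low latency and high bandwidth: achieved with GFT
    • Support new SDN features: be wary of design lock-in; new features should also reach all existing deployments
    • High performance on a single connection: impossible on CPUs, and spreading load over connections requires application changes
    • A path to 100GbE+: keep supporting growth in bandwidth and VM count
    • Maintain serviceability (flow state): be able to handle live migration and maintenance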
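
The cost figures on this slide are consistent with a simple multiplication, sketched below. The $900 per core per year comes from the slide; using 5 years (the upper end of the 3-5 year range) as the lifetime, and the 4-core case, are assumptions made only for this illustration.

```python
REVENUE_PER_CORE_PER_YEAR_USD = 900  # figure quoted on the slide
SERVER_LIFETIME_YEARS = 5            # assumption: upper end of the 3-5 year range


def lifetime_revenue(cores: int, years: int = SERVER_LIFETIME_YEARS) -> int:
    """Potential revenue lost if `cores` are spent on networking instead of being sold."""
    return cores * REVENUE_PER_CORE_PER_YEAR_USD * years


one_core = lifetime_revenue(1)    # matches the $4500 on the slide
four_cores = lifetime_revenue(4)  # hypothetical host burning 4 cores on networking
```
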
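  8. 4. SmartNIC Hardware Design (1/3): Hardware Options
    • They worked with network ASIC vendors on a design proposal, but no NIC came out of it
      • Apparently a design in which packets that miss on the first-packet match are forwarded to the host was not accepted
    • Option: ASIC-based
      • Time lag (1-2 years until an ASIC is ready, 5 years until servers are retired) and dependence on vendor-supplied firmware
    • Option: multicore SoC NIC
      • Code can be written as-is. It was fine at 10G, but at 40G the core scheduler adds latency. Core counts explode at 100G+, raising power and cost concerns
    • Option: DPDK
      • Per-packet processing cost drops considerably, but at the cost of more CPU cores. Slower than a multicore SoC
    • Option: FPGA
      • Keeps programmability while exploiting hardware performance (bus width, storage, pipelined processing). Catapult also gives it a track record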
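  9. 4. SmartNIC Hardware Design (2/3): Evaluating FPGAs as SmartNICs
    • SDN team: "We have never used FPGAs. Can we really use them?"
    • Is the logic 10x the size of an ASIC? -> In practice 2-3x
      • FPGAs: SRAM, I/O, plus more custom logic; ASICs: more programmable logic
    • Are FPGAs expensive? -> At Azure's scale they pay for themselves
    • Is FPGA programming hard?
      • High performance is achieved only with an efficient pipeline design
      • The Catapult team provided support. The AccelNet team was 5 people. HW/SW co-design model (e.g., agile development)
    • Do FPGAs scale?
      • Already solved by the Catapult team (e.g., the shell mechanism)
    • What about portability?
      • The code is written in SystemVerilog. Porting is feasible if the code is written with portability in mind. They actually moved from Xilinx to Altera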
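  10. 4. SmartNIC Hardware Design (3/3): SmartNIC System Architecture
    • Fully implementing SR-IOV and RDMA is difficult
    • Start by implementing only the SDN offload functions
    • Ride along in a bump-in-the-wire configuration
    • (a) 2015-, 40G (Altera); (b) OCP 50G (Intel)
    • Figure labels: FPGA w/ NIC; offload processing; SR-IOV processing; VFP processing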
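  11. 5. AccelNet System Design
    • Software design
      • Exception packets (roughly, first packets) are tagged and sent to the host/VFP
      • When the FPGA detects TCP connection setup or teardown (SYN/RST/FIN), it hands the packet to VFP so that VFP can track TCP state
    • FPGA pipeline design
      • Match/action, with reconfigurable actions. Uses about 1/3 of the logic area of a Stratix V D5
    • Flow tracking and reconciliation
      • ACK numbers and counters are synchronized to VFP over PCIe
      • Policy updates and generation management are also synchronized with VFP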
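
The exception-packet rules on this slide can be sketched as a lookup that refuses to use the fast path in three cases: the packet carries a TCP flag that changes connection state, the flow has no entry, or the entry predates the latest policy update. This is a minimal sketch under assumptions. `OffloadTable`, `Entry`, and the use of a single global generation counter are hypothetical simplifications, not the paper's data structures.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional

# TCP flags that change connection state, as listed on the slide.
STATE_CHANGING_FLAGS = frozenset({"SYN", "RST", "FIN"})


@dataclass
class Entry:
    action: str
    generation: int  # policy generation the entry was installed under


class OffloadTable:
    """Toy offload table: returns an action, or None for 'send to software'."""

    def __init__(self) -> None:
        self.entries: Dict[str, Entry] = {}
        self.generation = 0

    def install(self, flow_id: str, action: str) -> None:
        self.entries[flow_id] = Entry(action, self.generation)

    def bump_generation(self) -> None:
        # Assumed model of a policy update: older entries become stale.
        self.generation += 1

    def lookup(self, flow_id: str, flags: FrozenSet[str]) -> Optional[str]:
        if flags & STATE_CHANGING_FLAGS:
            return None  # let software track the TCP state change
        entry = self.entries.get(flow_id)
        if entry is None or entry.generation != self.generation:
            return None  # unknown or stale flow: exception path
        return entry.action


table = OffloadTable()
table.install("flow-a", "forward")
data_packet = table.lookup("flow-a", frozenset({"ACK"}))
fin_packet = table.lookup("flow-a", frozenset({"FIN", "ACK"}))
table.bump_generation()
after_policy_update = table.lookup("flow-a", frozenset({"ACK"}))
```

Treating a stale entry as a miss means a policy change never needs to walk the whole table; each flow is re-evaluated the next time one of its packets arrives.
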
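  12. 6. Performance Results
    • Test setup: 40 Gbps Gen1 SmartNIC + Xeon E5-2673 v4, over a Clos network within the same region
    • Figure annotations: 2x application performance; achieved on a single connection; high bandwidth, low latency
    • Comparison with other providers (as of November 2017)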
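  13. 7. Operationalization
    • State has to be saved and restored for maintenance and live migration
    • The VF should not be exposed directly to the guest
      • netvsc (Virtual Service Consumer) switches the underlying "leg" (data path)
      • Transparent bonding (committed to the kernel)
    • What if a DPDK PMD grabs the device directly?
      • They implemented the failsafe PMD (committed to DPDK)
    • RDMA
      • (In general, when something goes wrong) it falls back to TCP. So far this has not been a problem
    • The above works well for AccelNet stack updates and maintenance
      • VMs for which even a momentary jitter is a problem are notified through metadata so they can migrate in advance or shift traffic at the load balancer
    • Monitoring
      • Metrics, plus packet capture, tracing, and sampling at every interface on the FPGA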
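
The transparent bonding idea on this slide can be sketched as one stable guest-facing interface that picks its data path on every send: the VF when it is present, the synthetic (software) path otherwise. This is a toy model under assumptions. `BondedInterface` and its methods are hypothetical names and do not correspond to the real netvsc driver API.

```python
from typing import List, Tuple


class BondedInterface:
    """Toy model: one interface seen by the application, two possible data paths."""

    def __init__(self) -> None:
        self.vf_present = False
        self.sent: List[Tuple[str, str]] = []  # (path used, payload)

    def attach_vf(self) -> None:
        self.vf_present = True

    def detach_vf(self) -> None:
        # E.g. the host revokes the VF before maintenance or live migration.
        self.vf_present = False

    def active_path(self) -> str:
        return "vf" if self.vf_present else "synthetic"

    def send(self, payload: str) -> str:
        path = self.active_path()
        self.sent.append((path, payload))
        return path


nic = BondedInterface()
paths = [nic.send("p1")]       # no VF yet: software path
nic.attach_vf()
paths.append(nic.send("p2"))   # accelerated path
nic.detach_vf()
paths.append(nic.send("p3"))   # back to the software path, same interface
```

The application keeps calling `send` on the same object throughout, which is the point: the path switch is invisible above the bonded interface.
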
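  14. 8. Experiences (1/3): Were the goals met? -> Yes
    • Do not use host CPU: this is an IaaS business, so CPU use translates directly into cost (1 core = $900/year, $4500 over a 3-5 year lifetime)
      • No CPU is used on the data path. CPU utilization for exception processing is under 1%
    • Keep VFP's programmability: not everything has to be offloaded (e.g., policy changes during a live connection)
      • VFP is used as-is
    • Use SR-IOV for low latency and high bandwidth: achieved with GFT
      • The latency added by inserting the FPGA is <1 us, and line rate is achieved
    • Support new SDN features: be wary of design lock-in; new features should also reach all existing deployments
      • New functions on the FPGA (actions and so on) are added and updated, and run on every server
    • High performance on a single connection: impossible on CPUs, and spreading load over connections requires application changes
      • Done
    • A path to 100GbE+: keep supporting growth in bandwidth and VM count
      • Looks achievable
    • Maintain serviceability (flow state): be able to handle live migration and maintenance
      • It has been running in production for years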
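  15. 8. Experiences (2/3): On FPGAs in general
    • Are FPGAs data center ready?
      • They cannot be called the best solution for every cloud environment
      • Until the ecosystem matures, they are not a good fit for anyone but large cloud vendors
      • Having support from a hardware team (Catapult) helps, but...
    • Continuous implementation changes were possible with FPGAs (and probably would not have been with ASICs)
      • Custom encapsulation/overlay, NAT46, performance improvements (hashing, table insertion)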
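  16. 8. Experiences (3/3): Lessons learned (an excerpt of the points that resonated with the presenter)
    • Services that integrate HW, SW, and VMs (e.g., migration) and monitoring must be designed in; they cannot be bolted on afterwards
    • Put HW and SW people on the same team
    • Apply software techniques to FPGA development
      • VFP and the FPGA are handled separately in CI/CD. Treat hardware logic like software as far as possible
      • Instead of strict RTL verification, examine metrics and run scenarios in production
    • Better performance means better reliability: transient problems caused by host-side resources are eliminated
    • Iterative improvement of HW and SW matters: collect data from real workloads and keep improving
      • Hashing and caching were changed many times after release
    • The part that fails most often is the DRAM on the FPGA. The failure rate itself is no different from other parts
    • Abstract things in the upper layer (VFP) and think about offload separately
    • Because the CPU is not used, the network performance degradation from Meltdown/Spectre mitigations is greatly reduced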
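  17. 10. Conclusion and Future Work
    • FPGA-based Azure SmartNIC
      • Works in concert with VFP to accelerate (offload) networking
      • Insights from design and operations
    • If every hypervisor has a programmable NIC, entirely new features might become possible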
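  18. Three-line summary
    • Azure implemented an FPGA-based SmartNIC that works in concert with the host-side SDN software (Azure VFP)
    • Only the first packet (exception packet) is processed in software on the host to decide the action; that action is copied to the FPGA, and all later packets are processed in hardware (FPGA)
    • In a 40 Gbps environment, a single connection (VM to VM) achieved 32 Gbps and 15 us
    • At Azure's scale, developing new features on FPGAs (new encapsulation formats and so on) and scaling them out is feasible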