NSDI 2023 https://www.usenix.org/conference/nsdi23/technical-sessions
Research Paper Introduction #47-48: "Something like an NSDI 2023 recap" (cumulative #117-118)
@cafenero_777
2023/05/25, 06/08
Agenda
• Introduction to NSDI 2023
• A first look at the papers that caught my interest
• 23 papers
$ which
• NSDI 2023
• Boston, MA, USA, April 17-19, 2023
• https://www.usenix.org/conference/nsdi23/technical-sessions
• '23: 96/560 papers, acceptance rate: 17%
• '22: 78/396 papers, acceptance rate: 19.7%
• Offline only! (last year was the first hybrid event)
• Dual-track format continues
Awards
• Best Paper
  • LeakyScatter: A Frequency-Agile Directional Backscatter Network Above 100 GHz
  • CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation
  • DOTE: Rethinking (Predictive) WAN Traffic Engineering
• Community Award
  • Building Flexible, Low-Cost Wireless Access Networks With Magma
NSDI ’23 Technical Sessions
• 2023/04/17
  • RDMA
  • Learning with GPUs
  • RPC and Remote Memory
  • Congestion Control
  • Distributed Systems
  • Wireless
  • Cloud
  • Internet-Scale Networks
• 2023/04/18
  • Synthesis and Formal Methods
  • Data Centers
  • Systems for Learning
  • Privacy and Security
  • Video
  • Data
  • Making Systems Learn
  • IoT Networks
• 2023/04/19
  • Programming the Network
  • Alternative Networks
  • Performance
  • Serverless and Network Functions
  • Real Networks
  • Cellular
  • Testing
  • Physical Layer
24 tracks, 96 sessions
Reference: NSDI ’22 Technical Sessions
• 2022/04/04
  • Cluster Resource Management
  • Transport Layer - Part 1
  • Video Streaming
  • Programmable Switches - Part 1
  • Security and Privacy
  • Network Troubleshooting and Debugging
  • Operational Track - Part 1
  • Wireless - Part 1
• 2022/04/05
  • Reliable Distributed Systems
  • Raising the Bar for Programmable Hardware
  • Testing and Verification
  • Programmable Switches - Part 2
  • Sketch-based Telemetry
  • Transport Layer - Part 2
  • Troubleshooting
  • Wireless - Part 2
• 2022/04/06
  • Operational Track - Part 2
  • Edge IoT Applications
  • Cloud Scale Services
  • ISPs and CDNs
  • Cloud Scale Resource Management
  • Data Center Network Infrastructure
  • Multi-tenancy
  • Software Switching and Beyond
24 tracks, 78 sessions
Reference: NSDI '19 Technical Sessions
• 2019/02/26
  • Host Networking
  • Distributed Systems
  • Modern Network Hardware
  • Analytics
  • Data Center Network Architecture
• 2019/02/27
  • Wireless Technologies
  • Operating Systems
  • Monitoring and Diagnosis
  • Improving Machine Learning
  • Network Functions
  • Wireless Applications
• 2019/02/28
  • Network Characterization
  • Privacy and Security
  • Network Modeling
  • Wireless Applications
15 tracks, 50 sessions
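Recent trends
• // As seen from my own point of view
• RDMA now has its own session. Is it in production use?
• Satellite communications, and specialization for particular kinds of video delivery (TikTok-style swiping)
• Offloading: offload only the state, not the packets themselves
• Machine learning (job/resource scheduling) papers are as numerous as ever
• Wireless is also thriving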
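How I summarize
• Caveats
  • I only introduce papers that interested me
  • I only introduce papers that I was able to understand
  • Topics that are hard to explain properly: NIC queue, Distributed system, AI/DL, Semantics, Verification, Compiler, Wireless, Edge/IoT
• In other words, my usual criteria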
Survey
• Of the papers introduced, please pick the three that interest you most.
Day 1
RDMA
SRNIC: A Scalable Architecture for RDMA NICs
Hong Kong University of Science and Technology, ByteDance, Unaffiliated
• Scalable RDMA NIC architecture: SRNIC improves scalability
• Prototype implemented on an FPGA
• Stable even with 10k QPs (queue pairs)
• PFC-free
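Hostping: Diagnosing Intra-host Network Bottlenecks in RDMA Servers
BUPT, Purple Mountain Laboratories, ByteDance Inc.
• With GPUs and RDMA at 100G and beyond, the intra-host network becomes the bottleneck
• Hostping: runs loopback tests between the RNIC and intra-host endpoints to diagnose and analyze latency and bandwidth
• Found six new bottlenecks in addition to the known ones
[Figure labels: Intra-host; Inter-host (misconfig.)]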
Understanding RDMA Microarchitecture Resources for Performance Isolation
Duke University, Microsoft, Shanghai Jiao Tong University
• Goal: per-VM performance isolation for RDMA
• No RNIC microarchitecture that can isolate performance exists today.
• The findings have already been shared with NVIDIA, Chelsio, and Intel.
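Empowering Azure Storage with RDMA
Microsoft
• Azure regions have started to support RDMA for storage
• RDMA is enabled on both the VM (hypervisor) side and the storage side, and is also used between DCs within a region
• On the NIC: DCQCN and the sK-RDMA protocol. In the network: PFC/SONiC/SAI
• Uses RDMA over Converged Ethernet v2 (RoCEv2) on the existing infrastructure
• 70% of traffic is RDMA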
Learning with GPUs
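Zeus: Understanding and Optimizing GPU Energy Consumption of DNN Training
University of Michigan
• Training has focused on completion time, with energy efficiency treated as secondary
• Clarifies the trade-off between energy consumption and training time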
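The energy/time trade-off on the Zeus slide can be illustrated with a small sketch. Everything here is an assumption for illustration: the `Run` records, the power limits, and the measurements are invented, and the weighted cost (a knob `eta` between "energy only" and "time only") is one simple way to express such a trade-off, not necessarily how the paper or its tooling does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    power_limit_w: int  # hypothetical GPU power cap used for the run
    time_s: float       # time to reach the target accuracy (invented)
    energy_j: float     # energy consumed to reach it (invented)


def cost(run: Run, eta: float, max_power_w: float) -> float:
    """Weighted cost: eta=1 considers only energy, eta=0 only time.

    Time is multiplied by max_power_w so both terms are in joules.
    """
    return eta * run.energy_j + (1.0 - eta) * max_power_w * run.time_s


def best_run(runs, eta: float, max_power_w: float) -> Run:
    """Pick the run with the lowest weighted cost."""
    return min(runs, key=lambda r: cost(r, eta, max_power_w))


# Invented measurements: lower power caps save energy up to a point,
# but always lengthen training.
RUNS = [
    Run(300, 1000.0, 280_000.0),
    Run(250, 1100.0, 245_000.0),
    Run(200, 1300.0, 240_000.0),
    Run(150, 1800.0, 255_000.0),
]
MAX_POWER_W = 300

for eta in (0.0, 0.5, 1.0):
    print(eta, best_run(RUNS, eta, MAX_POWER_W).power_limit_w)
```

With these made-up numbers the fastest setting and the most energy-frugal setting differ, and an intermediate `eta` selects a point between them, which is the kind of trade-off the slide refers to.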
RPC and Remote Memory
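Remote Procedure Call as a Managed System Service
Duke University, University of Washington, Shanghai Jiao Tong University
• Implementing RPC separately in every application is inefficient, so RPC is turned into a service (a daemon?)
• mRPC: 2.5x compared with a sidecar. Flexibility also improves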
Congestion Control
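Bolt: Sub-RTT Congestion Control for Ultra-Low Latency
Stanford University, Google LLC
• Congestion control for the 200G/400G era. Does not fit within the BDP
• SRC (Sub-RTT Control) notices congestion early, and Proactive Ramp Up anticipates flow completions to claim the freed bandwidth quickly
• Compared with Swift and HPCC, cuts 99th-percentile latency by 88% and improves FCT by 3x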
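To get a feel for the BDP mentioned on the Bolt slide, here is a small worked calculation. The bandwidth-delay product is link rate times round-trip time. The 10 µs RTT is an assumed value chosen only for illustration; the slide gives no RTT.

```python
def bdp_bytes(link_gbps: float, rtt_us: float) -> float:
    """Bandwidth-delay product in bytes.

    Gbit/s * microseconds = 1e9 * 1e-6 = 1e3 bits, hence the factor 1000.
    """
    bits = link_gbps * 1000 * rtt_us
    return bits / 8


ASSUMED_RTT_US = 10  # assumption: not taken from the slides

for gbps in (100, 200, 400):
    print(f"{gbps}G, {ASSUMED_RTT_US} us RTT -> {bdp_bytes(gbps, ASSUMED_RTT_US):,.0f} bytes")
```

Under this assumption the BDP doubles from 250,000 bytes at 200G to 500,000 bytes at 400G, so the amount of data in flight within a single RTT grows with link speed. That is why reacting on a sub-RTT timescale becomes attractive.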
Understanding the impact of host networking elements on traffic bursts
Johns Hopkins University, Meta
• Visualizes traffic processing with eBPF
• Bursts, congestion control, qdisc, scheduler, NIC scheduler, HW offload, protocol
• Observable at timescales from nanoseconds to seconds
Distributed Systems
DiSh: Dynamic Shell-Script Distribution
MIT, University of Pennsylvania, Purdue University, Brown University
• DiSh:
• Let's do distributed computing with shell scripts!
• Bash-based, using an automatic parallelization system (PaSh) and HDFS/Hadoop Streaming
Wireless• Skip
Cloud
SkyPilot: An Intercloud Broker for Sky Computing
University of California, Berkeley; UC Berkeley and ICSI
• Sky Computing = intercloud broker
• Using a different public cloud for each workload yields cost benefits (time, price)
• cf: https://misreading.chat/2023/04/25/112-skypilot-an-intercloud-broker-for-sky-computing/
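The per-workload cloud selection described on the SkyPilot slide can be sketched as a tiny broker. This is a minimal illustration under stated assumptions: the cloud names, runtimes, and prices in `OFFERS` are invented, and `choose_cloud` is a hypothetical helper, not SkyPilot's API or its actual optimizer.

```python
from typing import Dict, Optional, Tuple

# OFFERS[task][cloud] = (estimated_hours, usd_per_hour); all numbers invented.
OFFERS: Dict[str, Dict[str, Tuple[float, float]]] = {
    "preprocess": {"cloud_a": (2.0, 1.0), "cloud_b": (1.5, 1.6), "cloud_c": (3.0, 0.5)},
    "train": {"cloud_a": (10.0, 3.0), "cloud_b": (6.0, 4.0), "cloud_c": (12.0, 2.8)},
}


def choose_cloud(
    task: str,
    offers: Dict[str, Dict[str, Tuple[float, float]]],
    max_hours: Optional[float] = None,
) -> Optional[Tuple[str, float]]:
    """Return (cloud, total_cost) with the lowest cost for the task.

    If max_hours is given, clouds that would take longer are excluded.
    Returns None when no cloud satisfies the deadline.
    """
    candidates = []
    for cloud, (hours, rate) in offers[task].items():
        if max_hours is not None and hours > max_hours:
            continue
        candidates.append((hours * rate, cloud))
    if not candidates:
        return None
    total_cost, cloud = min(candidates)
    return cloud, total_cost


for task in OFFERS:
    print(task, choose_cloud(task, OFFERS))
```

In this toy catalog the cheapest cloud differs between the two tasks, and adding a deadline (`max_hours`) can change the choice again, which mirrors the "time, price" benefits listed on the slide.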
Invisinets: Removing Networking from Cloud Networks
UC Berkeley, Google, Microsoft
• Using cloud networking is far too much work
• Provides an API that abstracts the tenant network
• PRDO: Publicly Routable but Default Off
  • Routable, but denied by default
  • Every endpoint is assigned an IPv6 address
• Reduced complexity by 90%
• Cf: https://misreading.chat/2023/05/18/114-invisinets-removing-networking-from-cloud-networks/
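The "Publicly Routable but Default Off" idea on the Invisinets slide can be sketched as an endpoint that has a public IPv6 address yet accepts nothing until a permit is added. The `Endpoint` class and its `permit`/`allows` methods are hypothetical names for this illustration only, not the Invisinets API. The addresses come from the IPv6 documentation prefix `2001:db8::/32`.

```python
import ipaddress


class Endpoint:
    """Hypothetical endpoint: publicly routable, but off by default."""

    def __init__(self, address: str) -> None:
        self.address = ipaddress.ip_address(address)
        self._permits = []  # list of (allowed source network, port)

    def permit(self, cidr: str, port: int) -> None:
        """Allow traffic from sources in `cidr` to `port`."""
        self._permits.append((ipaddress.ip_network(cidr), port))

    def allows(self, src: str, port: int) -> bool:
        """Default deny: true only if some permit entry matches."""
        src_ip = ipaddress.ip_address(src)
        return any(src_ip in net and port == p for net, p in self._permits)


ep = Endpoint("2001:db8::10")
print(ep.allows("2001:db8:1::5", 443))  # nothing permitted yet
ep.permit("2001:db8:1::/48", 443)
print(ep.allows("2001:db8:1::5", 443))  # matches the permit entry
```

The design point is that reachability is expressed per endpoint as an explicit allow list, so the tenant does not have to reason about routing itself.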
Internet-Scale Networks
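xBGP: Faster Innovation in Routing Protocols
ICTEAM, UCLouvain, IIJ/Arrcus, Inc, NSG, ETH Zürich
• Adding features to BGP is slow, yet people want to use them soon
• Defines and implements a vendor-neutral API and the extension points of BGP implementations with eBPF
• Implemented on FRR and BIRD
• Seven use cases are presented: discarding routes by TS when a withdraw fails, monitoring and sharing how routes are selected, measuring propagation time, etc.
• Cf: https://blog.apnic.net/2021/01/27/xbgp-toward-a-fully-extensible-bgp/
[Figure labels: 873k routes @ IPv4; 120k routes @ IPv6]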
The rest is covered in part two
Day 2
Synthesis and Formal Methods• Skip
Data Centers
Flattened Clos: Designing High-performance Deadlock-free Expander Data Center Networks Using Graph Contraction
Shanghai Jiao Tong University, Chinese Academy of Sciences
• Proposes FC, the Flattened Clos topology
• Logically splits each ToR into k parts, builds virtual up-down paths between neighbors, and flattens the topology
• CBD-free routing
Systems for Learning
TOPOOPT: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
Massachusetts Institute of Technology, Meta, CMU, Telescent
• DNN training over 100G RDMA on the TOPOOPT topology
• Direct-connect network with optical switches + patch panels + NPAR
• 3x faster than Fat-Tree, and cheaper @ 12 nodes
[Figure: communication pattern during training]
Privacy and Security• Skip
Video• Skip
Data• Skip
Making Systems Learn• Skip
IoT Networks• Skip
Day 3
Programming the Network
A High-Speed Stateful Packet Processing Approach for Tbps Programmable Switches
KTH Royal Institute of Technology, Roma Tre University, UCLouvain
• During RDMA transfer, the state is split off and forwarded to the NFs
• This is done in P4
• Achieves 300 Gbps
ExoPlane: An Operating System for On-Rack Switch Resource Augmentation
Microsoft, University of Texas at Austin, Carnegie Mellon University
• In-network computing on the rack
• Realizes in-network computing (INC) with the ToR switch (P4) and SmartNICs. In particular, state management is coordinated between them
RingLeader: Efficiently Offloading Intra-Server Orchestration to NICs
Google, UT Austin
• Turns intra-server orchestration (scheduling?) into NIC-assisted CPU scheduling
• Implemented on an FPGA; improves tail latency, throughput, and CPU utilization
Alternative Networks• Skip
Performance
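Skyplane: Optimizing Transfer Cost and Throughput Using Cloud-Aware Overlays
University of California, Berkeley
• A bulk data transfer system across clouds
• Finds the most cost-efficient way to transfer (Skyplane planner)
  • Solved with linear programming
• Within a cloud: up to 4.6x
• Across clouds: up to 5.0x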
Electrode: Accelerating Distributed Protocols with eBPF
Harvard University, Peking University, Cornell University
• Implements distributed protocols in the kernel (eBPF)
• No context-switch or network-stack overhead
• Throughput improves by 128%, latency by 41%
Serverless and Network Functions
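Disaggregating Stateful Network Functions
Microsoft and AMD Pensando
• Uses general-purpose ARM cores and an ASIC (fast stateful match/action) to move processing off the host and disaggregate the NFs
• Built a 12-NIC machine; NF performance improves 10x
• Presents results from production operation in Azure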
Real Networks• Skip
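DOTE: Rethinking (Predictive) WAN Traffic Engineering
Hebrew University of Jerusalem, Microsoft Research, Technion
• Best paper!
• DOTE: performs WAN TE with deep learning trained only on historical data
  • Direct Optimization for Traffic Engineering
• Optimizes directly instead of predicting demand (no fine-grained IPFIX analysis, not demand-based)
• Stochastic optimization, with ML/DL used to cope with real-world conditions
• Fast to compute, and the results are good
• Robust to traffic changes and failures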
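Dashlet: Taming Swipe Uncertainty for Robust Short Video Streaming
Princeton University
• Improves video streaming by specializing in the timing of swipes
• Implements buffering coordinated with video recommendation, plus bitrate improvements
• Confirms improved video quality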
Cellular• Skip
Testing
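Norma: Towards Practical Network Load Testing
Nanjing University, Alibaba Group
• What pktgen cannot do
  • Stateful, realistic traffic
  • Tbps-scale bandwidth and rate control
• Norma: built on a programmable switch ASIC (Tofino with 1k lines of P4*)
• Generates 3 Tbps of TCP and 1 Tbps of HTTP traffic
* Plus 8k lines for basic switch functions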
Physical Layer• Skip
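Three-line summary
• RDMA has tremendous momentum, and it is running in production!
• Practical uses of XDP/eBPF (as an analysis tool for writing papers?)
• Deal with state somehow (by offloading or compressing it) to keep up with high bandwidth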
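Impressions
• There is simply too much! (second time, one year after the first)
• Even reading only the abstracts and conclusions is exhausting
• Better than last year // I have just gotten used to it
• I wish I had attended NSDI in person
• It is best to read the papers you are curious about when you become curious
• I would like to get a bit more hands-on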
EoP