Scale-Up, Scale-Out, Scale-Across
• Scale-Up: within a node or rack. The GPUs of a server node are connected through a scale-up switch (NVLink / NVSwitch, SUE, UALink).
• Scale-Out: across racks (cluster). Nodes, each with several GPUs, are connected through an Ethernet / InfiniBand network fabric.
• Scale-Across: across data centers. The GPU clusters of separate data centers are connected through a WAN / DCI fabric (cross-DC interconnect).
[Figure: two servers, each with a CPU, a DPU, XPUs, and NICs, attached to the Front-end Network, the Scale-Up Network, and the Scale-Out Network.]
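SONiC Workshop Japan 2026 | 2026-06-19 | Exploring via the SONiC Scale-Up Working Group: How Scale-Up and Ultra Ethernet Features Are Implemented | Kentaro Ebisawa

(Reference) What are semantics?
• The "meaning" or "intent" carried by an API operation or a data exchange.
• Unlike mere syntax (for example, the arguments of an API), semantics focus on what the operation means and what result is expected.

Scale Out (InfiniBand, RoCE): message semantics. Operations: Send / Recv API.
Scale Up (NVLink, UALink, SUE): memory semantics. Operations: Load / Store.
Each has advantages and disadvantages, and the two are used in combination.

[Figure, Scale Out: XPU Core, XPU Memory (HBM), and NIC on each side; the XPU core issues a send instruction to the NIC, and DATA travels over RoCE with GPU Direct, in steps (1) to (6).]
[Figure, Scale Up: XPU Core, XPU Memory (HBM), and NIC on each side; DATA travels over the Scale Up link.]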
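The difference between the two semantics can be sketched with a toy model. This is only an illustration: the `Node` class and the `send`, `recv`, `store`, and `load` helpers are hypothetical names invented here, not the API of any NIC, NVLink, UALink, or SUE library.

```python
from collections import deque


class Node:
    """Toy XPU with local memory and an inbound message queue (hypothetical model)."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.inbox = deque()


# Message semantics (scale-out style): the sender posts a message and the
# receiver must post a matching receive that decides where the data lands.
def send(dst, payload):
    dst.inbox.append(bytes(payload))


def recv(node, offset):
    payload = node.inbox.popleft()
    node.memory[offset:offset + len(payload)] = payload
    return len(payload)


# Memory semantics (scale-up style): the initiator addresses remote memory
# directly; the remote side takes no explicit action.
def store(dst, offset, payload):
    dst.memory[offset:offset + len(payload)] = payload


def load(src, offset, length):
    return bytes(src.memory[offset:offset + length])


a, b = Node(16), Node(16)
send(b, b"ping")   # a -> b as a message
recv(b, 0)         # b chooses offset 0 for the payload
store(b, 4, b"pong")  # a writes straight into b's memory at offset 4
result = load(b, 0, 8)
print(result)
```

With Send / Recv the receiver decides where the data lands; with Load / Store the initiator names the remote address itself, which is why the two styles suit different traffic patterns.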
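Alibaba example of why we need Scale Up (not only Scale Out)
Source: 2025-06-17 SONiC Scale Up WG: Reference model, Joy (Yijiao) Qin @Alibaba
• In Alibaba's experience, "extreme low latency" is not always necessary: computation time is required regardless, so reducing latency beyond a certain point no longer makes the workload faster.
• Importance of high bandwidth: large models such as DeepSeek-V3 transfer large volumes of data, so a high-bandwidth network that keeps communication time (Comm. Time) short is essential.
• Performance gain with NVL72: compared with a conventional 400 Gbps InfiniBand NIC, the 900 GB/s NVL72 cuts communication time substantially (from 120.96 µs to 6.72 µs).
• Shorter inference time: with the high-bandwidth NVL72, DeepSeek-V3's total inference time per output token (TPOT, time per output token) drops sharply from 14.76 ms to 0.82 ms, which raises throughput from about 67 tokens/s to about 1200 tokens/s.
• However, the effect on each stage, such as Prefill, also needs to be considered in detail.
• Adding NICs for Scale Out also makes power consumption an issue.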
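The relationship between the inference times and the tokens-per-second figures on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the 14.76 ms and 0.82 ms values are per-output-token times; the function names are illustrative.

```python
def tokens_per_second(time_per_token_ms):
    """Convert a per-token time in milliseconds to tokens per second."""
    return 1000.0 / time_per_token_ms


def gbps_to_gigabytes_per_s(gbps):
    """Convert a link rate in gigabits per second to gigabytes per second."""
    return gbps / 8.0


infiniband_gb_s = gbps_to_gigabytes_per_s(400)  # 400 Gbps InfiniBand NIC
bandwidth_ratio = 900 / infiniband_gb_s         # 900 GB/s NVL72 vs. the NIC

tps_infiniband = tokens_per_second(14.76)
tps_nvl72 = tokens_per_second(0.82)

print(f"InfiniBand NIC: {infiniband_gb_s} GB/s, NVL72 is {bandwidth_ratio}x wider")
print(f"Throughput: {tps_infiniband:.1f} -> {tps_nvl72:.1f} tokens/s")
```

The results land at roughly 67 and 1200 tokens/s, matching the figures quoted above, and the 18x bandwidth gap explains why communication time shrinks by about the same factor.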
OCP ESUN - Network Operator Requirements - Base Specification 1.0 (2026-02-09)
https://www.opencompute.org/documents/ocp-esun-network-operator-requirements-base-specification-rev-1-0-final-pdf
• 5.2. New ESUN Header Definition
• 5.3.2. ESUN Enhanced capabilities: "ESUN proposes use of a feature, Link Level Retry, defined by Ultra Ethernet Consortium (UEC) to achieve link resiliency against link level errors."

SAI: Optimized Forwarding Headers #2285
https://github.com/opencomputeproject/SAI/pull/2285
"This PR brings a generic framework to support multiple activities going on for defining the Optimized Forwarding Header like IEEE compressed Header, OCP ESUN Header, UEC Unified Forwarding Header and more."
• Optimized Forwarding Headers (OFH)
• Unified Forwarding Header (UFH)
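The general idea behind a link-level retry mechanism can be sketched as a sender that keeps each frame in a replay buffer until the link peer acknowledges it. This is a simplified illustration, not the UEC LLR specification: the `RetrySender` class, the cumulative-ACK behavior, and the unbounded sequence numbers are all assumptions made for the example.

```python
from collections import OrderedDict


class RetrySender:
    """Toy retry sender: buffers every frame until the peer ACKs it (assumed model)."""

    def __init__(self):
        self.next_seq = 0
        self.replay_buffer = OrderedDict()  # seq -> frame, oldest first

    def transmit(self, frame):
        seq = self.next_seq
        self.replay_buffer[seq] = frame
        self.next_seq += 1
        return seq, frame

    def on_ack(self, seq):
        # Assumed cumulative ACK: frames up to and including seq are released.
        for acked in [s for s in self.replay_buffer if s <= seq]:
            del self.replay_buffer[acked]

    def on_nack(self, seq):
        # Replay every buffered frame from the first missing sequence number.
        return [(s, f) for s, f in self.replay_buffer.items() if s >= seq]


sender = RetrySender()
for payload in (b"f0", b"f1", b"f2"):
    sender.transmit(payload)

sender.on_ack(0)               # frame 0 arrived intact
replayed = sender.on_nack(1)   # frame 1 was lost or corrupted on the link
print(replayed)
```

Because the retry happens between the two ends of a single link, a corrupted frame can be recovered without involving the end-to-end transport, which is the resiliency property the ESUN text refers to.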
Scale-Up Working Group
• SONiC AI workgroup (2024): https://lists.sonicfoundation.dev/g/sonic-wg-ai/
• SONiC Scale Up workgroup (2025): https://lists.sonicfoundation.dev/g/SONiC-Scale-Up-WG
Meeting Minutes: https://lists.sonicfoundation.dev/g/SONiC-Scale-Up-WG/wiki
• 2025-09-16 Meta-X: Quantitative Approach to Scale Network from GPU Perspectives, Zhaoshi @Meta-X
• 2025-09-23 ETH-X: Semantic and Transaction Layer for Ethernet-based Scale Up, Weifeng @Tencent
• 2025-10-07 OCP SONiC Workshop slides about Scale Up WG, Riff and Eddie
• 2025-10-14 ETH-X Test Report using Scale Up hardware prototype, Yijian @Shanghai UniVista
• 2025-10-21 Alibaba UPN512: Ultra Performance Network for AI Scale-Up, Zhiping Yao @Alibaba
• 2025-10-28 Tencent Scale Up Rack design, Peter @Tencent
• 2025-11-04 LLDP with LLR and CBFC, Venkat @Dell
• 2025-11-11 ESUN introduction, Ian Cox @Broadcom
• 2025-11-18 Chassis and SONiC Software module discussions, Haiyang Zheng @Alibaba
• 2025-12-02 NVLink Fusion Intro, Krishnan Geeyarpuram @NVIDIA
• 2025-12-16 Scale-Up Architecture Doc discussion (rack level)
• 2026-01-13 LLR/CBFC spec and SAI, Rupa @Marvell
• 2026-01-27 Scale-Up AI Cluster Architecture Doc review (Eddie)
• 2026-03-03 LLR HLD, Ravi @Marvell
• 2026-03-10 Dual Rack System, Sean @Bytedance
• 2026-03-24 Interesting topics from GTC 2026
• 2026-04-21 Test sub-group, Various Scale-Up Headers discussion (Eddie)
• 2026-06-02 Test standard draft, Haiyan @Alibaba, Daisy @Keysight

Organizations: Alibaba, Broadcom, Bytedance, Cisco, Dell, Keysight, Microsoft, Marvell, Meta-X, NVIDIA, Tencent, UniVista
# show llr counters detailed Ethernet0
LLR Counters - Ethernet0
-----------------------
LLR_INIT CtrlOS Transmitted ............................. 1
LLR_INIT_ECHO CtrlOS Transmitted ............................. 1
LLR_ACK CtrlOS Transmitted ............................. 35000
LLR_NACK CtrlOS Transmitted ............................. 0
LLR Frames Transmitted OK .................................... 35000
LLR Frames Transmitted as poisoned ........................... 0
LLR Frames Discarded at Transmit ............................. 0
LLR Tx Replay Triggered Count ................................ 0
LLR_INIT CtrlOS Received ................................ 1
LLR_INIT_ECHO CtrlOS Received ................................ 1
LLR_ACK CtrlOS Received ................................ 15000
LLR_NACK CtrlOS Received ................................ 0
LLR_ACK/NACK CtrlOS Received with SeqNum error .............. 0
LLR Frames Received OK ....................................... 35000
LLR Frames Received as Poisoned .............................. 0
LLR Frames Received as Bad ................................... 0
LLR Rx Replay Detect Count ................................... 0
LLR Frames Received OK with expected seq num ................. 0
LLR Frames Received Poisoned with expected seq num ........... 0
LLR Frames Received Bad with expected seq num ................ 0
LLR Frames Received with Unexpected seq num .................. 0
LLR Frames Received with Duplicate seq num ................... 0
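Counter output in this shape is easy to turn into a dictionary for monitoring scripts. The sketch below assumes the "name, run of dots, value" layout shown on the slide; the real CLI output may differ, and the `parse_counters` helper and the choice of which counters count as suspicious are illustrative assumptions.

```python
import re

# A few lines copied from the slide's sample output.
SAMPLE = """\
LLR_ACK CtrlOS Transmitted ............................. 35000
LLR_NACK CtrlOS Transmitted ............................. 0
LLR Frames Transmitted OK .................................... 35000
LLR Tx Replay Triggered Count ................................ 0
LLR_ACK/NACK CtrlOS Received with SeqNum error .............. 0
"""

# Counter name, then at least two dots, then an integer value.
LINE_RE = re.compile(r"^\s*(?P<name>.+?)\s*\.{2,}\s*(?P<value>\d+)\s*$")


def parse_counters(text):
    """Return {counter name: int value} for every line matching the layout."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            counters[match.group("name")] = int(match.group("value"))
    return counters


counters = parse_counters(SAMPLE)

# Assumed health check: any non-zero NACK, replay, or error counter is suspicious.
suspicious = {
    name: value
    for name, value in counters.items()
    if value and any(key in name for key in ("NACK", "Replay", "error"))
}
print(counters["LLR_ACK CtrlOS Transmitted"], suspicious)
```

In the sample all NACK, replay, and error counters are zero, so the check reports nothing, which is what a healthy link would be expected to show.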