Slide 1

Slide 1 text

Cassini + Goldstone DCI use case and challenges
mixi, Inc.
Toshiya Mabuchi
Copyright © 2019 mixi, Inc.

Slide 2

Slide 2 text

mixi’s network background 1.5 years ago

• Core network has an MPLS L3-VPN with MP-BGP / LDP
  • Multitenant network by MPLS L3-VPN
• High cost of leased lines
  • 10GE * n leased lines for DCI
• Operational cost increases as the number of devices increases
  • Want to avoid adding devices as far as possible!
• No DWDM operation experience until then
• Long lead time for a new leased DCI line
  • Was 2-4 months until now

Slide 3

Slide 3 text

Why introduce Cassini + Goldstone

Low cost & Flexible & Scalable
• Optic module cost is very low
• Can start with a minimal set of modules / has large capacity
Server-like operation
• In-house operational tool development
• Improves issue traceability
Ease of future migration for part of the core network
• Point-to-Point DWDM (now in production)
• L3 core and DWDM (WIP)
• MPLS P router and DWDM (in the future)

Slide 4

Slide 4 text

Evaluation

Slide 5

Slide 5 text

Evaluation phase

Focused on DCI
• Stability test
  • Packet forwarding (over 6 months)
  • Memory / CPU load / Temperature / Disk I/O
• Packet forwarding performance
  • IMIX / ICMP / L2 protocol … by T-Rex
• Packet loss detection test
• Operation workflow test

[Diagram: T-Rex (Eth) - Cassini - ~15 km - Cassini - T-Rex (Eth), with a packet count check on each Cassini]
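
The packet count check in the loss detection test can be sketched as a comparison of counters taken on the two Cassini units. This is a minimal illustration only: the `PortCounters` structure, the function names, and the counter values are assumptions and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class PortCounters:
    """Packet counters sampled from one port (hypothetical structure)."""
    tx_packets: int
    rx_packets: int


def lost_packets(sender: PortCounters, receiver: PortCounters) -> int:
    """Packets transmitted by the sender that the receiver never counted."""
    return sender.tx_packets - receiver.rx_packets


def loss_ratio(sender: PortCounters, receiver: PortCounters) -> float:
    """Fraction of transmitted packets that were lost; 0.0 if nothing was sent."""
    if sender.tx_packets == 0:
        return 0.0
    return lost_packets(sender, receiver) / sender.tx_packets


# Made-up counter values for the line ports at each end of the ~15 km span.
site_a = PortCounters(tx_packets=1_000_000, rx_packets=0)
site_b = PortCounters(tx_packets=0, rx_packets=999_990)
print(lost_packets(site_a, site_b))  # 10
```

In a real test the counters would have to be sampled while the traffic generator is stopped, otherwise packets still in flight would be counted as lost.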

Slide 6

Slide 6 text

Issue1: Transponder mode is not supported

At first, we wanted to deploy as point-to-point DWDM
But Cassini with Goldstone is a switch + DSP
• Cassini is not a DWDM transponder
• Link fault pass-through does not exist
Core network IGP down detection becomes hold-timer dependent

Slide 7

Slide 7 text

Issue1 workaround and solution

• Using BFD end to end

[Diagram: Router - Cassini (Goldstone) - Cassini (Goldstone) - Router, with OSPF and BFD (1 sec * 3) between the routers]

WIP: Developing “transponderd” for high-speed detection without BFD
• Subscribe to link state via netlink
• Create a failure detection group
• Send Ethernet heartbeats to members
• Send a down request to members

[Diagram: Router - Eth - Goldstone (transponderd) - Goldstone (transponderd) - Eth - Router; labels: down notify, subscribe]
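
The heartbeat-based failure detection group can be sketched as follows. This is not the transponderd implementation; the class, its methods, the member names, and the 0.1 s interval are hypothetical, and the reading that healthy members are asked to bring their ports down once a peer goes silent is an assumption. The detection time is interval × multiplier, the same arithmetic as the BFD setting on the slide (1 sec * 3 = 3 s).

```python
from dataclasses import dataclass, field


@dataclass
class FailDetectionGroup:
    """Tracks the last heartbeat seen from each member (hypothetical design)."""
    interval: float    # seconds between heartbeats
    multiplier: int    # missed heartbeats tolerated before declaring failure
    last_seen: dict = field(default_factory=dict)

    @property
    def detection_time(self) -> float:
        """Silence longer than this marks a member as failed."""
        return self.interval * self.multiplier

    def add_member(self, name: str, now: float) -> None:
        self.last_seen[name] = now

    def heartbeat(self, name: str, now: float) -> None:
        """Record a heartbeat received from a member."""
        self.last_seen[name] = now

    def failed_members(self, now: float) -> list:
        """Members silent for longer than the detection time."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.detection_time
        )

    def down_requests(self, now: float) -> list:
        """Healthy members that should be asked to bring their port down."""
        failed = self.failed_members(now)
        if not failed:
            return []
        return sorted(name for name in self.last_seen if name not in failed)


group = FailDetectionGroup(interval=0.1, multiplier=3)
group.add_member("site1", now=0.0)
group.add_member("site2", now=0.0)
group.heartbeat("site1", now=0.5)
print(group.failed_members(now=0.5))  # ['site2']
print(group.down_requests(now=0.5))   # ['site1']
```

Time is passed in explicitly so the logic can be exercised without real timers; a daemon would feed it a monotonic clock and send the requests over the network.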

Slide 8

Slide 8 text

Issue2: SONiC is very delicate

Manual operation is difficult
• Not designed for frequent Day2 config changes
  e.g. a typo by the operator in a command caused a Critical Error
• Requires Day2 configuration stability
  • SONiC config validator
  • Set all configs at first deployment

Syncd down by manual operation

2019-09-12.06:37:27.491030|s|SAI_OBJECT_TYPE_PORT:oid:0x1000000000013|SAI_PORT_ATTR_ADMIN_STATE=false
2019-09-12.06:37:27.491149|s|SAI_OBJECT_TYPE_PORT:oid:0x1000000000013|SAI_PORT_ATTR_SPEED=400000
2019-09-12.06:37:27.491252|s|SAI_OBJECT_TYPE_PORT:oid:0x1000000000013|SAI_PORT_ATTR_ADMIN_STATE=true
2019-09-12.06:37:27.493167|n|switch_shutdown_request||

Typo in configuration
Syncd is down….
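
A config validator of the kind listed above could reject a bad port speed before it reaches syncd. The sketch below is an illustration under stated assumptions: the allowed speed set, the port names, and the dictionary layout are hypothetical and would need to match the actual platform. The value 400000 is the one that appears in the log just before `switch_shutdown_request`.

```python
# Assumed set of valid port speeds in Mbps; adjust for the real platform.
ALLOWED_SPEEDS_MBPS = {10000, 25000, 40000, 100000}


def validate_port_config(ports: dict) -> list:
    """Return human-readable errors; an empty list means the config passed."""
    errors = []
    for name, attrs in sorted(ports.items()):
        speed = attrs.get("speed")
        try:
            speed_value = int(speed)
        except (TypeError, ValueError):
            errors.append(f"{name}: speed {speed!r} is not an integer")
            continue
        if speed_value not in ALLOWED_SPEEDS_MBPS:
            errors.append(f"{name}: speed {speed_value} is not in the allowed set")
    return errors


# Hypothetical candidate config; Ethernet2 carries the speed seen in the log.
candidate = {
    "Ethernet1": {"speed": "100000"},
    "Ethernet2": {"speed": "400000"},
}
for error in validate_port_config(candidate):
    print(error)
```

Running a check like this before every manual change keeps a single mistyped digit from becoming a switch-wide outage.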

Slide 9

Slide 9 text

Issue3: OS Hang-up!

Cause 1: High DSP temperature
• Fan control was not implemented in ONL. Fixed in oopt v0.8
• Added temperature monitoring via onlpdump-binding
Cause 2: snmpd memory leak
• Memory leak due to requests that snmpd does not support
• Workaround: stop the snmpd container
• Use monitoring tools such as Prometheus node_exporter

[Figure annotation: Leaked…]
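
A leak like the snmpd one shows up in monitoring as memory use that only ever grows. The heuristic below is a rough sketch of such a check, not something taken from the slides; the function name, the sample values, and the threshold are hypothetical, and the samples are assumed to come from an external monitoring system.

```python
def looks_like_leak(samples_kib: list, min_growth_kib: int) -> bool:
    """True when memory use never decreases and grows by at least min_growth_kib."""
    if len(samples_kib) < 2:
        return False
    never_decreases = all(
        later >= earlier
        for earlier, later in zip(samples_kib, samples_kib[1:])
    )
    total_growth = samples_kib[-1] - samples_kib[0]
    return never_decreases and total_growth >= min_growth_kib


# Made-up resident memory samples for one process, in KiB.
print(looks_like_leak([100, 120, 150, 200], min_growth_kib=50))  # True
print(looks_like_leak([100, 90, 150], min_growth_kib=10))        # False
```

A strict never-decreases rule is deliberately simple; real processes fluctuate, so a production alert would more likely compare averages over longer windows.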

Slide 10

Slide 10 text

Deployed!

Slide 11

Slide 11 text

Current Cassini + Goldstone use case

For mobile gaming backend network (Production)

[Diagram: Site1 (external connection site) and Site2 (Bare-metal Application Server site 1) connected through Cassini with Goldstone. Labels: P/PE Core, Cloud, App servers, Databases, Transit, Peers, IGP, BFD]

Slide 12

Slide 12 text

WIP: Migration including Layer 3 routing

Reduce external routers
• Utilization of SONiC + FRR
• Use eBGP/iBGP for backbone routing

[Diagram: Site1 (external connection site) and Site2 (Bare-metal Application Server site 1) connected through Goldstone (as L3) nodes, each running SONiC, FRR and TAI; the external routers are marked "Reduce!". Labels: P/PE Core, Cloud, App servers, Databases, Transit, Peers]

Slide 13

Slide 13 text

Next: As MPLS backbone lean core (Future)

• DWDM + MPLS core network
• As an MPLS Provider (P) router
• Implement an MPLS interface in SAI and the ASIC SDK
• Control plane will use FRR-ldp
• Develop labeld in place of syncd
• Interface is SAI-MPLS

[Diagram: Site1 (external connection site) and Site2 (Bare-metal Application Server site 1) connected through DWDM+MPLS nodes, each running SONiC, FRR, TAI and label, with LDP and OSPF toward the PE Core routers. Labels: PE Core, Cloud, App servers, Databases, Transit, Peers]

Slide 14

Slide 14 text

Why Goldstone and Cassini

1. Cost
• Especially the cost-effectiveness of modules (ACO, DCO, etc.)
• More than 1/5 cost effective
2. Agility & Flexibility
• Ease of roadmap planning
• Can scale quickly and flexibly from a small start to large scale
• Has enough capacity for fast deployment
3. Open Architecture
• Goldstone is a great OSS ecosystem (ONL, SONiC, SAI, TAI, gNMI, etc.)
• Server-like operation & monitoring (Prometheus, Ansible, Python tools, etc.)
• Improves trouble traceability

Slide 15

Slide 15 text

Future Challenges

• Making Goldstone more compatible with the core network
  • Goldstone will often be in demand on the core network
  • Core networks often use MPLS, but SONiC does not support MPLS
    Chip vendor support is required for MPLS support
• SONiC stability
  • Should support Day2 configuration while providing some stability
• More compatible hardware
  • Increased hardware support is also needed from a redundancy perspective

Slide 16

Slide 16 text

Summary

• Deployed Cassini + Goldstone to DCI production
  • A one-year verification focusing on stability
  • There are some issues, but the cost advantages and agility outweigh them
• Currently operating with Point-to-Point + BFD
  • Developing transponderd for devices that do not support BFD
• Introducing a design to process L3 with Goldstone
• Will develop a component as an MPLS lean core with DWDM in the future
  • Flexibility and reduction of core equipment
  • Implementing SAI MPLS and connecting it to the control plane will be a major challenge
