Slide 1

Slide 1 text

No-Full Route Changed Our Lives @AS55394
Osamu Kurokochi, Data Center Team, Infrastructure Headquarters
Copyright © GREE, Inc. All Rights Reserved.

Slide 2

Slide 2 text

Self Introduction
Name: Osamu Kurokochi
Dept.: Data Center Team, Infrastructure Headquarters, GREE, Inc.

Slide 3

Slide 3 text

Summer 2012 (configuration at the time)
There were 8 Border Gateway Protocol (BGP) routers. Every router received the full route, and the routers were connected to one another as iBGP peers in a full-mesh topology.
[Diagram: GREE environment, 8 routers (R) in a full mesh]
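
To make the full-mesh configuration concrete, the session count can be worked out directly. This is a minimal sketch: the function names are hypothetical, and only the router count of 8 comes from the slide.

```python
def full_mesh_sessions(routers: int) -> int:
    """Total iBGP sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def peers_per_router(routers: int) -> int:
    """iBGP peers that each single router has to maintain in a full mesh."""
    return routers - 1


if __name__ == "__main__":
    n = 8  # number of BGP routers given on the slide
    print(f"sessions in total: {full_mesh_sessions(n)}")  # 28
    print(f"peers per router:  {peers_per_router(n)}")    # 7
```

With full route on every session, each router may have to process updates from all 7 of its iBGP peers in addition to its transit peers whenever a large route change occurs.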

Slide 4

Slide 4 text

Summer 2012 (occurrence of the fault)
One day, a fault on the transit side caused the transit peers to go down.
Route convergence took a long time, and the router CPUs stopped responding.
The iBGP peers then began to go down as well, which caused chaos.
Outages of about 5 minutes recurred intermittently until the routes converged.
(The only thing we could do was watch it happen.)

Slide 5

Slide 5 text

Summer 2012 (cause of the fault)
There were 3 main factors.
1. Insufficient hardware processing capability
2. Increased number of routes
3. Too many iBGP peers (although there were not really that many)
The fault was caused by one of these 3 factors or by a combination of them.

Slide 6

Slide 6 text

Breakthrough Solutions Considered at the Time
Solution 1. Reinforce the hardware
Buy hardware with better performance.
Solution 2. Change the configuration
Introduce route reflectors (RR) and reduce the number of iBGP peers.
Solution 3. Decrease the number of routes
Decrease the number of routes as a mechanism to reduce the load during a BGP update.
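
As a rough illustration of what Solution 2 would change, the sketch below compares iBGP session counts for a full mesh and for a route-reflector layout. The layout (reflectors meshed with each other, every other router a client of every reflector) and the choice of 2 reflectors are assumptions made for illustration; the slides do not say what design was considered.

```python
def full_mesh_sessions(routers: int) -> int:
    """Sessions when every router peers with every other router."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers: int, reflectors: int) -> int:
    """Sessions when reflectors are fully meshed with each other and
    each remaining router (client) peers with every reflector."""
    if not 0 < reflectors <= routers:
        raise ValueError("reflectors must be between 1 and the router count")
    clients = routers - reflectors
    return full_mesh_sessions(reflectors) + clients * reflectors


if __name__ == "__main__":
    n = 8  # router count from the slides
    rr = 2  # hypothetical number of route reflectors
    print(f"full mesh:        {full_mesh_sessions(n)} sessions")            # 28
    print(f"with {rr} reflectors: {route_reflector_sessions(n, rr)} sessions")  # 13
```

Under these assumptions each client keeps 2 iBGP peers instead of 7, but the number of routes carried per session is unchanged.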

Slide 7

Slide 7 text

Key Judgment Point
Replacing the BGP routers at all sites was also considered, but it would have required too much effort.
Considering the procedure "verification → order → delivery → maintenance arrangement", this remedy was too slow.
We judged that the problem would be difficult to solve by introducing new hardware.

Slide 8

Slide 8 text

Key Judgment Point
We narrowed the solutions down to Solution 3.
In our company's business model, 99% of accesses came from mobile devices.
We reconsidered whether the full route was necessary at all, with the following result:
Full route → Partial route + Default route
* Partial route = 3 domestic mobile carriers and 5 ASes.
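
The effect of "partial route + default route" on forwarding can be sketched with a longest-prefix-match lookup. This is only an illustration: the prefixes are documentation ranges, and the next-hop labels are hypothetical, not GREE's actual routes.

```python
import ipaddress

# Hypothetical IPv4 table: two partial routes plus a default route.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): "transit-a (partial route)",
    ipaddress.ip_network("198.51.100.0/24"): "transit-b (partial route)",
    ipaddress.ip_network("0.0.0.0/0"): "transit-a (default route)",
}


def lookup(destination: str) -> str:
    """Return the next hop of the most specific route covering the address."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    # The default route matches every IPv4 address, so matches is never
    # empty for IPv4 input; the longest prefix wins.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


if __name__ == "__main__":
    print(lookup("192.0.2.10"))   # covered by a partial route
    print(lookup("203.0.113.5"))  # falls through to the default route
```

Destinations covered by a partial route still get a specific path, while everything else follows the default route, so the table stays small.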

Slide 9

Slide 9 text

There Are Two Ways to Get Partial Routes
1. In-house filtering: the transit router sends full route + default route, and our own router filters them down to partial route + default route (RIB, FIB).
2. Transit filter method: the transit router sends partial route + default route, and our own router accepts them transparently (RIB, FIB).
[Diagram: transit router → own router (RIB, FIB) for each method; one method is marked "GREE adopts this solution."]
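
A minimal sketch of what in-house filtering (method 1) might look like in principle. It assumes the filter keeps the default route plus any route whose origin AS is on an allow list; the slides do not state the actual match criteria, and the AS numbers and prefixes below are documentation values, not real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple[int, ...]  # leftmost = neighbor AS, rightmost = origin AS


# Hypothetical allow list of origin ASes (documentation AS numbers).
ALLOWED_ORIGINS = {64496, 64497}
DEFAULT_PREFIX = "0.0.0.0/0"


def in_house_filter(received: list[Route]) -> list[Route]:
    """Keep the default route and routes originated by an allowed AS."""
    kept = []
    for route in received:
        is_default = route.prefix == DEFAULT_PREFIX
        wanted_origin = bool(route.as_path) and route.as_path[-1] in ALLOWED_ORIGINS
        if is_default or wanted_origin:
            kept.append(route)
    return kept


if __name__ == "__main__":
    full_feed = [
        Route("0.0.0.0/0", (64500,)),
        Route("192.0.2.0/24", (64500, 64496)),
        Route("203.0.113.0/24", (64500, 64510)),
    ]
    for route in in_house_filter(full_feed):
        print(route.prefix)
```

With the transit filter method (method 2), the equivalent selection would happen on the transit side, so the own router would only ever see the already reduced set.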

Slide 10

Slide 10 text

Summary of Solutions
Solution 1. Reinforce the hardware
Buy hardware with better performance.
→ Verification is required, and delivery takes time.
Solution 2. Change the configuration
Implement route reflectors (RR) to reduce the number of iBGP peers.
→ Verification is required, delivery takes time, and there is no conclusive evidence that the problem will be rectified.
Solution 3. Decrease the number of routes
Lower the number of routes as a mechanism to reduce the load during a BGP update.
→ This can solve the problem in a short time and is reliable.

Slide 11

Slide 11 text

Solution of the Problem
We actually tried the solution.
Number of routes: 400,000 routes at the time → reduced to approx. 2,600 routes
These have since been further reduced to approx. 1,800 routes.
Bring it on, line trouble!
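
For a sense of scale, the reduction can be expressed as a percentage. The route counts are the approximate figures from the slide; the helper name is hypothetical.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which the route count shrank."""
    return (1 - after / before) * 100


FULL_ROUTE = 400_000  # approximate full route size at the time

if __name__ == "__main__":
    for label, count in [("first reduction", 2_600), ("later reduction", 1_800)]:
        pct = reduction_percent(FULL_ROUTE, count)
        print(f"{label}: {count} routes, {pct:.2f}% fewer than full route")
```

In both cases the routers hold well under 1% of the routes they carried before.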

Slide 12

Slide 12 text

I thought...
Even without the full route, we can continue our business.
Why not reconsider what it really means for you to carry the full route?

Slide 13

Slide 13 text
