
Software engineering that supports LINE-original LBaaS


Yutaro Hayakawa
LINE Network Development Team Infrastructure Engineer
https://linedevday.linecorp.com/jp/2019/sessions/F1-7

LINE DevDay 2019

November 20, 2019


Transcript

  1. 2019 DevDay
    Software Engineering That
    Supports LINE-Original LBaaS
    > Yutaro Hayakawa
    > LINE Network Development Team Infrastructure
    Engineer


  2. Speaker: Yutaro Hayakawa
    > Joined the LINE Network Development Team as a
    New Graduate This Year
    > Working on the Development and Operation of Load
    Balancers

  3. A Private Cloud Service for LINE Developers
    Verda
    > OpenStack Based Private Cloud
    > Since 2016


  4. Scale of Verda
    1,400 Hypervisors
    35,000 Virtual Machines (20,000+ Last Year)

  5. A Private Cloud Service for LINE Developers
    Verda
    IaaS: Compute, Networking, Storage
    Managed Services: K8S, Kafka, Redis, MySQL
    PaaS: Function Platform, Load Balancer

  6. Scales the LINE Applications
    Verda Load Balancer as a Service (LBaaS)
    [Diagram: a shared load balancer cluster provides virtually dedicated load balancers to the VMs of Services A, B, and C]

  7. Two Types of Load Balancing
    For Different Requirements
    [Diagram: requests fanning out to servers through each type]
    > Layer 7 Load Balancer (L7LB), a.k.a. Reverse Proxy
    > Layer 4 Load Balancer (L4LB)

  8. Users of the Verda LBaaS
    Messenger
    Family Services
    LINE BLOG
    LINE Clova
    Text Messages
    Videos
    Icon Images
    Ads
    Etc…

  9. Not Just “Operating” It,
    We Are “Developing” an LBaaS
    L7 Load Balancing Service
    > Prepare Certificates
    > Optimize Resource Allocation
    L4 Load Balancing Service
    > Efficient and Fast Data Plane
    > Completely Developed From Scratch
    API Server and Orchestration Systems
    > Automation Friendly

  10. Why?
    Fundamental Problems
    > Operational Cost
    > Availability
    > Scalability

  11. In the Past
    From the Beginning of LINE to 2016
    [Diagram: user traffic enters L4/L7 hardware LB appliances (1 + 1 Active-Standby), which distribute it to the servers of Services A, B, and C]

  12. Painful for Both Operators and Users
    Operational Cost
    > Takes About 1 to 2 Days To Register Backends
    > Cannot Meet Rapidly Increasing Demand
    > CLI-Based Manual Operation

  13. Scalability and Availability Issues
    Session Table Exhaustion Problem
    > Session Table
    • Remembers Client → Backend Server Mappings
    per TCP Connection
    > Doesn’t Scale With Large User Traffic
    > A DoS Attack Can Cause a Big Outage
    • e.g. a TCP SYN Flood Attack
    Session Table (Source → Destination):
    Client 1 → Server A, Client 2 → Server D, Client 3 → Server A, …, Client N → Server X

  14. Rethink the Load Balancing
    Fully
    Automated
    Reduce the
    Failure Domain
    Scales With
    Large Traffic


  15. Ride on the Shoulders of “Tech Giants”
    Research and Development
    [Timeline 2010 to 2019: Microsoft Ananta[1], Cloudflare Blog Post[2], MS Research Duet[3], Facebook Talk at SRECon[4], Google Maglev[5]; “We Were Here” marks 2016]

  16. A New Architecture
    Multi Tier N + 1 Load Balancer Cluster
    [Diagram: Router Tier → L4LB Tier → L7LB Tier → Backend Servers]

  17. A New Architecture
    Multi Tier N + 1 Load Balancer Cluster
    N + 1, Active-Active

  18. A New Architecture
    Multi Tier N + 1 Load Balancer Cluster
    Stateless: No Session Table

  19. A New Architecture
    Multi Tier N + 1 Load Balancer Cluster
    Software L4LB and L7LB Tiers

  20. Software Load Balancer
    Runs on Commodity Server Hardware
    > 10 Times Cheaper Than Appliances (Per HTTPS Request)
    > Operated Like a Server

  21. Controller Design
    > Ordinary Python Web Application

    • Provides API To Interact With Load
    Balancer Clusters

    • Fully Automated

    > User Interface

    • GUI, CLI, or Use API Directly

    > Common Authentication With OpenStack

    > Revision Management
    $ openstack verda lb create …
    $ openstack verda lb list


  22. L7 Load Balancer Tier Design
    [Diagram: Verda Region A runs two L7LB clusters; VIPs (VIP1 to VIP6) are bound across the k8s Nodes (Verda VMs) of each cluster]
    > Use k8s for Resource Scheduling
    • 2 Clusters for Each Verda Region (Active-Active)
    • k8s Node == Verda VM
    • Bind VIPs to a Deployment


  26. L4 Load Balancer Tier Design
    [Diagram: Verda Region A runs two L4LB clusters; each VIP (VIP1 to VIP7) is replicated across L4LB Nodes 1 to 3]
    > Non-Orchestrated Physical Servers
    • Due to Special Network and Performance Requirements
    • VIP Settings Are Replicated Among Multiple Nodes
    • Data Plane Fully Developed From Scratch


  28. Why Do We Need To Build
    the L4 Load Balancer From Scratch?
    > The Common Problem of Software-Based Load Balancers Is Performance
    > The Performance Objective for a Single L4LB Instance Was 7 Mpps
    > Difficult To Achieve With Existing Load Balancer Software

  29. 500 LoC Fast L4 Load Balancer With XDP
    XDP (eXpress Data Path)
    [Diagram: XDP hooks packets in the NIC driver, below the kernel protocol stack and user applications; C code is compiled and attached to the hook]
    > “Fast Path” of the Linux Network Stack
    • Hooks Packets in the NIC Driver
    > Packet Processing Can Be Written in Very Simple C Code
    > The Kernel Statically Verifies the “Safety” of the Code

  30. How To “Keep” the Performance?
    Continuous Performance Testing
    > Performance Is a “Value” of the Service
    > We Need To Continuously Make Sure We Keep the Performance
    > Like CI/CD

  31. How To Do a Reproducible Performance Test?
    Fully Automated Performance Tests
    [Diagram: a Developer opens a PR on GitHub, which triggers Drone (CI/CD) to run a load test with a traffic generator in a unified test environment and report the result back]


  33. The Case of On-Demand
    Feature Implementation
    (My first task)


  34. Problem of the Stateless L4LB
    Drawback of the Stateless Approach
    Cannot Do
    “Graceful” Shutdown
    Difficult To Failover


  35. Problem of the Stateless L4LB
    Drawback of the Stateless Approach
    Hash( 5-tuple ) of Each Packet:
    1. Source IP
    2. Destination IP
    3. Source Port
    4. Destination Port
    5. Protocol Number
    Hash Table (Hash Value → Destination):
    0 → Backend A, 1 → Backend C, 2 → Backend B, 3 → Backend D, …

  36. Problem of the Stateless L4LB
    Drawback of the Stateless Approach
    Hash Table Before (Hash Value → Destination): 0 → Server A, 1 → Server C, 2 → Server B, 3 → Server D, …
    Hash Table After a Backend Change: 0 → Server D, 1 → Server C, 2 → Server B, 3 → Server B, …
    Cannot Do a “Graceful” Shutdown & Difficult To Failover

  37. Consistent Hashing
    > A Special Type of Hash Function That
    Reduces the Chance of Connection
    Disruption When the Hash Table Is Updated
    > We Use Maglev Hashing [5]

  38. Use Cases Where
    Consistent Hashing Is Not Enough
    > Media Platform
    • Connections Suddenly Disrupted During File Uploads, Video Playback, …
    > Ads Platform
    • Missed Ad Impressions Due to Connection Resets …
    > The Problem Was a Blocker When the Media Platform Migrated to Verda

  39. Session Caching
    Lookup Flow: Look Up the Session Cache First; on a Miss, Look Up the Hash Table and Update the Session Cache
    Session Cache (Client → Server Mapping): Client1 → Server A, Client2 → Server C, Client3 → Server B, Client4 → Server D, …
    Hash Table (Hash → Server Mapping): 0 → Server A, 1 → Server C, 2 → Server B, 3 → Server D, …

  40. Session Table Exhaustion Problem Again!


  41. Stateless vs Stateful
    > Stateless Hashing

    • Pros: Simple and Scalable

    • Cons: Difficult To Failover
    > Stateful Session Caching

    • Pros: Easy To Failover

    • Cons: Vulnerable to SYN Flood


  42. Solution: Hybrid Approach
    > Detect a SYN Flood on the L4LB
    > Fall Back to the Stateless Mode for a
    While When a SYN Flood Is Detected
    [Graph: SYN/s over time; crossing the threshold marks “SYN Flood!!!”]

  43. Achievement
    > Took About 3 Months To Implement
    > Already Deployed to Production
    > The Media Platform Team Successfully Migrated to Verda

  44. Future Work


  45. Integrating With Other Verda Services
    > Integration With the Managed Kubernetes
    Service
    > Use Our LBaaS as an “Ingress” or
    “type: LoadBalancer” Service

  46. Adapt our Load Balancers to
    New Network Architecture
    > Native SRv6 Support
    > SRv6 Load Balancer (?)


  47. (Again) Ride on the Shoulders of “Tech Giants”
    Research and Development
    [Timeline 2012 to 2019: MS Research Duet[3], Facebook Talk at SRECon[4], Google Maglev[5], Fastly Faild[6], Facebook Katran[7], GitHub GLB[8] (NEW, 2018); “We Are Here” marks 2019]

  48. References
    [1] Patel, Parveen, et al. "Ananta: Cloud scale load balancing." ACM SIGCOMM Computer Communication
    Review. Vol. 43. No. 4. ACM, 2013.

    [2] https://blog.cloudflare.com/cloudflares-architecture-eliminating-single-p/

    [3] Gandhi, Rohan, et al. "Duet: Cloud scale load balancing with hardware and software." ACM SIGCOMM
    Computer Communication Review. Vol. 44. No. 4. ACM, 2014.

    [4] https://www.usenix.org/conference/srecon15/program/presentation/shuff

    [5] Eisenbud, Daniel E., et al. "Maglev: A fast and reliable software network load balancer." 13th USENIX
    Symposium on Networked Systems Design and Implementation (NSDI 16). 2016.

    [6] Araújo, João Taveira, et al. "Balancing on the edge: Transport affinity without network state." 15th USENIX
    Symposium on Networked Systems Design and Implementation (NSDI 18). 2018.

    [7] https://github.com/facebookincubator/katran

    [8] https://github.com/github/glb-director


  49. Thank You!
