A Deep Dive into Service Mesh and Istio - CNDT 2019

Shukun
July 23, 2019

Slides presented in Room C on Day 2 of Cloud Native Days Tokyo 2019 / OpenStack Days Tokyo 2019 (CNDT/OSDT).

Transcript

  1. A Deep Dive into
    Service Mesh and Istio
    ~$ speaker
    Shukun SONG
    ~$ company
    FUJITSU LIMITED
    Copyright 2019 FUJITSU LIMITED
    CLOUDNATIVE DAYS TOKYO 2019 /
    OPENSTACK DAYS TOKYO 2019
    [2C1] 13:20 – 14:00 JULY 23rd, 2019

  2. Outline
    Service mesh and Istio
    • Architecture
    • Packet flow
    Performance measurement of Istio
    Where the problem is
    Our trial
    Istio’s next feature

  3. Who am I?
    Shukun SONG
    Software engineer on the OSS development team at Fujitsu.
    He focuses on improving the performance of Istio and on rolling upgrades
    for Kubernetes components.

  4. Outline
    Service mesh and Istio
    • Architecture
    • Packet flow
    Performance measurement of Istio
    Where the problem is
    Our trial
    Istio’s next feature

  5. Service Mesh
    What is a service mesh?
    • The network of microservices that make up a cloud-based, containerized
      application, together with the service-to-service communication between
      them. It provides the following key features:
      • Traffic management: service discovery, routing
      • Observability: telemetry
      • Policies and security: policy checks, authentication, credential management

  6. Overview of Istio
    What is Istio?
    • A service mesh started by Lyft, Google and IBM.
    Characteristics
    • Extensible and pluggable
    • Not a CNCF project; runs on K8s

  7. Istio’s Architecture
    [Diagram: the data plane holds Service A and Service B, each with a Proxy; the control plane holds Pilot, Mixer, Citadel and Galley]

  8. Istio’s Architecture
    [Diagram: data plane with Service A and Service B, each with a Proxy]
    Proxy = Envoy + (Mixer client)
    Proxy between applications
    Istio’s proxy includes Envoy, an OSS service proxy for cloud native applications.

  9. Istio’s Architecture
    [Diagram: Mixer in the control plane; Service A and Service B, each with a Proxy, in the data plane]
    Observability and control
    Mixer handles policy controls and telemetry collection.

  10. Istio’s Architecture
    [Diagram: Pilot in the control plane; Service A and Service B, each with a Proxy, in the data plane]
    Traffic management
    Pilot configures all proxies to perform load balancing and routing for
    requests, and provides an API for service discovery.

  11. Istio’s Architecture
    [Diagram: Citadel in the control plane; Service A and Service B, each with a Proxy, in the data plane; two labels read "CA"]
    Certificate management
    Citadel provides key pairs and manages certificates for mTLS.

  12. Istio’s Architecture
    [Diagram: Galley, Mixer and Pilot in the control plane; Service A and Service B, each with a Proxy, in the data plane]
    Config management
    Galley can validate Istio configuration with a K8s ValidatingWebhook.

  13. Packet flows (except telemetry)
    [Diagram: LB, istio-ingressgateway, several replicas of Service A and Service B each with a Proxy, and Mixer (istio-policy)]

  14. Outline
    Service mesh and Istio
    • Architecture
    • Packet flow
    Performance measurement of Istio
    Where the problem is
    Our trial
    Istio’s next feature

  15. Performance measurement of Istio
    Purpose
    • Measure the response time of a single request
    Test environment
    • Istio version: 1.0.5
    • A K8s HA cluster with 3 masters and 5 workers
    • VM spec: 8 × 2 GHz cores + 8 GB memory
    • The bookinfo sample app with 3 replicas
    • 100 requests sent at 1 RPS (requests per second)

  16. Result of measurement
    • Our result
                        Min [ms]   Max [ms]   Average [ms]
        With Istio      33.56      73.07      53
        Without Istio   22.9       50.82      36
      On average, a response takes 17 ms (about 47%) longer with Istio.
    • Community result
      • Official report
        https://istio.io/docs/concepts/performance-and-scalability/
      • Recent performance assessment
        https://drive.google.com/drive/folders/1Npndz_yUVXktocg5MUNgwA0lP1tNzrZX
    Mixer adds a lot of overhead, in both check and report.

  17. Where the problem is
    • Data path
      ① Request
      ② Check send
      ③ Check response
      ④ Request
      ⑤ Report send
      ⑥ Report response
    [Diagram: Service A with its Proxy in an app pod on Node 1; Service B with its Proxy in an app pod on Node 2; Istio-policy and Istio-telemetry on Node 3]

  18. Mixer architecture
    • Uniform abstraction of backends
    • Envoy produces common attributes based on requests
    • Mixer consumes attributes and talks to backends using adapters
    • Backends provide functionality such as logging, monitoring and checking
    [Diagram: the Proxy sends attributes to Mixer, which reaches three backends through three adapters]
    Attribute examples:
      request.path: xyz/abc
      request.size: 234
      request.time: 12:34:56.789 04/17/2017
      source.ip: [192 168 0 1]
      destination.service.name: example
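
The attribute-and-adapter model can be sketched in a few lines. This is an illustration only: the `Adapter` interface, the `Mixer` struct and both adapters are hypothetical names invented here, not Mixer's real adapter API. The point is that the proxy only produces a bag of attributes, and each backend is reached through its own adapter.

```go
package main

import "fmt"

// Attributes is a bag of named values describing one request,
// like the attribute examples on the slide.
type Attributes map[string]interface{}

// Adapter is a hypothetical interface for illustration;
// it is not Mixer's actual adapter API.
type Adapter interface {
	Name() string
	Handle(attrs Attributes) error
}

// logAdapter stands in for a logging backend.
type logAdapter struct{ lines []string }

func (l *logAdapter) Name() string { return "log" }

func (l *logAdapter) Handle(a Attributes) error {
	l.lines = append(l.lines, fmt.Sprintf("path=%v size=%v", a["request.path"], a["request.size"]))
	return nil
}

// denyAdapter stands in for a checking backend that rejects one service.
type denyAdapter struct{ blocked string }

func (d *denyAdapter) Name() string { return "deny" }

func (d *denyAdapter) Handle(a Attributes) error {
	if a["destination.service.name"] == d.blocked {
		return fmt.Errorf("service %q is blocked", d.blocked)
	}
	return nil
}

// Mixer passes each attribute bag to every configured adapter in order
// and stops at the first adapter that reports an error.
type Mixer struct{ adapters []Adapter }

func (m *Mixer) Dispatch(a Attributes) error {
	for _, ad := range m.adapters {
		if err := ad.Handle(a); err != nil {
			return fmt.Errorf("%s: %w", ad.Name(), err)
		}
	}
	return nil
}

func main() {
	logger := &logAdapter{}
	m := &Mixer{adapters: []Adapter{logger, &denyAdapter{blocked: "forbidden"}}}
	err := m.Dispatch(Attributes{
		"request.path":             "xyz/abc",
		"request.size":             234,
		"destination.service.name": "example",
	})
	fmt.Println(logger.lines, err)
}
```

Adding a backend in this model means adding one more `Adapter` implementation; the proxy side does not change.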

  19. 1. Check
    • Data path
      ① Request
      ② Check send
      ③ Check response
      ④ Request
      ⑤ Report send
      ⑥ Report response
    [Diagram: Service A with its Proxy in an app pod on Node 1; Service B with its Proxy in an app pod on Node 2; Istio-policy and Istio-telemetry on Node 3]

  20. 1. Check
    Blocks the request until the check is done
    To reduce traffic:
    • Cache
      Does not really help: there are too many attributes with too many
      values, so caching only makes the local proxy bigger and bigger.
    • Turn off
      Check does not really matter; most of its purposes can be achieved with RBAC.
      Off by default from v1.1.
    [Diagram: Proxy with a cache, Mixer, and Adapters A, B and C (A: xxx, B: xx, C: xx); the caches hold copies of attribute sets such as the following]
      request.path: xyz/abc
      request.size: 234
      request.time: 12:34:56.789 04/17/2017
      source.ip: [192 168 0 1]
      destination.service.name: example
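
Why caching check results helps so little can be seen with a toy model. The sketch below is a deliberate simplification and not Mixer's actual caching logic: `CheckCache` and `cacheKey` are hypothetical names, and the cache is keyed on every attribute value. As soon as one attribute (here `request.time`) differs per request, every lookup misses and the cache only grows.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// cacheKey joins all attribute names and values in sorted order,
// so two requests share a key only if every attribute matches.
func cacheKey(attrs map[string]string) string {
	names := make([]string, 0, len(attrs))
	for n := range attrs {
		names = append(names, n)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, n := range names {
		b.WriteString(n)
		b.WriteByte('=')
		b.WriteString(attrs[n])
		b.WriteByte(';')
	}
	return b.String()
}

// CheckCache is a hypothetical proxy-side cache of check verdicts.
type CheckCache struct {
	entries      map[string]bool
	Hits, Misses int
}

func NewCheckCache() *CheckCache {
	return &CheckCache{entries: map[string]bool{}}
}

// Check returns the cached verdict, or calls remote (standing in for a
// round trip to the policy service) and stores the result.
func (c *CheckCache) Check(attrs map[string]string, remote func(map[string]string) bool) bool {
	key := cacheKey(attrs)
	if v, ok := c.entries[key]; ok {
		c.Hits++
		return v
	}
	c.Misses++
	v := remote(attrs)
	c.entries[key] = v
	return v
}

func main() {
	allow := func(map[string]string) bool { return true }
	c := NewCheckCache()
	for i := 0; i < 1000; i++ {
		c.Check(map[string]string{
			"request.path": "xyz/abc",
			// A per-request value makes every key unique.
			"request.time": fmt.Sprintf("12:34:56.%03d", i),
		}, allow)
	}
	fmt.Printf("hits=%d misses=%d entries=%d\n", c.Hits, c.Misses, len(c.entries))
}
```

With 1000 requests that differ only in `request.time`, the model records no hits and ends with 1000 cache entries, which is the "bigger and bigger" effect in miniature.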

  21. 2. Report
    • Data path
      ① Request
      ② Check send
      ③ Check response
      ④ Request
      ⑤ Report send
      ⑥ Report response
    [Diagram: Service A with its Proxy in an app pod on Node 1; Service B with its Proxy in an app pod on Node 2; Mixer (Istio-policy and Istio-telemetry) on Node 3]

  22. 2. Report
    Happens after the request is done
    Batched; it does not happen on every request.
    It does not always affect performance, but it does hurt performance when a report is sent.
    Currently, the default batch size has been reduced from 1000 to 100.
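
Batched reporting can be sketched as a buffer that is flushed once it reaches the batch size. This is a simplified, size-only model with hypothetical names (`Reporter`, `Report`, `Flush`), not the Mixer client's real implementation. It shows that most requests only append to a buffer, while the request that fills the batch triggers the send.

```go
package main

import "fmt"

// Reporter buffers report records and sends them in batches.
type Reporter struct {
	batchSize int
	buf       []string
	send      func(batch []string)
	Flushes   int
}

func NewReporter(batchSize int, send func([]string)) *Reporter {
	return &Reporter{batchSize: batchSize, send: send}
}

// Report buffers one record. It returns true when this call filled the
// batch and therefore triggered a send.
func (r *Reporter) Report(rec string) bool {
	r.buf = append(r.buf, rec)
	if len(r.buf) < r.batchSize {
		return false
	}
	r.Flush()
	return true
}

// Flush sends whatever is buffered, if anything.
func (r *Reporter) Flush() {
	if len(r.buf) == 0 {
		return
	}
	r.send(r.buf)
	// Start a fresh buffer so the sent batch is not overwritten.
	r.buf = nil
	r.Flushes++
}

// Pending returns the number of records waiting for the next flush.
func (r *Reporter) Pending() int { return len(r.buf) }

func main() {
	sent := 0
	r := NewReporter(100, func(batch []string) { sent += len(batch) })
	for i := 0; i < 250; i++ {
		r.Report(fmt.Sprintf("request-%d", i))
	}
	fmt.Printf("flushes=%d sent=%d pending=%d\n", r.Flushes, sent, r.Pending())
}
```

A smaller batch size means more frequent but smaller sends, which is the trade-off behind changing the default.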

  23. Our trial

  24. Current structure
    [Diagram: Service A with its Proxy in an app pod on Node 1; Service B with its Proxy in an app pod on Node 2; Mixer (Istio-policy and Istio-telemetry) on Node 3. Callout: "Reduce this traffic"]

  25. Our idea
    • Reduce traffic between nodes
    [Diagram: Node 1 holds Service A with its Proxy in an app pod plus its own Mixer (Istio-policy and Istio-telemetry); Node 2 holds Service B with its Proxy in an app pod plus its own Mixer (Istio-policy and Istio-telemetry)]

  26. Reference (IBM)
    • Measurement
      • Environment
        - Node A: 1 CPU (2.2 GHz) / 4 GB RAM
        - Node B: 2 CPUs (2.2 GHz) / 4 GB RAM
        - Node C: 4 CPUs (2.2 GHz) / 8 GB RAM
      • Run mixc on a node, access istio-telemetry through the service,
        and send 100k requests
    • Result
      • Average report time: 3.9 ms
    [Diagram: Istio-telemetry on each of Nodes A, B and C; mixc on Node A; service address istio-telemetry.istio-system.svc.cluster.local:15004]

  27. Reference (IBM)
    • Measurement
      • Instead of mixc, write a client that calls the report function
        directly and serially
    • Result
      • Compared with the 3.9 ms result, reporting to the local mixer
        reduces latency by 1.1~2.4 ms
    [Diagram: a Client next to Istio-telemetry on each of Nodes A, B and C]

  28. Distributed Mixer
    Our trial
    Points
    ① Create a mixer on each node
    ② Ensure proxies communicate with the mixer on the same node
    [Diagram: two nodes, each with one Mixer (①) and three app containers whose Proxies talk to that node's Mixer (②)]

  29. Ensure proxies communicate with the mixer on the same node
    [Diagram: two options side by side]
    • Access to Mixer via service (load balanced)
    • Access to Mixer directly

  30. Ensure proxies communicate with the mixer on the same node
    Access to Mixer directly
    Sidecar deployment
    Envoy has to be deployed with a configuration that specifies Mixer directly.
    [Diagram: Pilot-agent turns a Template into a JSON config that points at Mixer]

  31. Ensure proxies communicate with the mixer on the same node
    Access to Mixer directly
    Sidecar deployment
    Envoy has to be deployed with a configuration that specifies Mixer directly.
    [Diagram: Pilot-agent turns a Template into a JSON config that points at Mixer]
    …,
    "dynamic_resources": { // Ask Pilot },
    "static_resources": {
      // Pilot information,
      // Add mixer’s information
    },…,

  32. Our trial
    Configure Envoy
    • Add static resource
      The procedure for creating the configuration from the template is
      implemented in the function WriteBootstrap in the file
      istio.io/istio/pkg/bootstrap/bootstrap_config.go:
        // WriteBootstrap generates an envoy config based on config and epoch, and returns the filename.
        // TODO: in v2 some of the LDS ports (port, http_port) should be configured in the bootstrap.
        func WriteBootstrap(config *meshconfig.ProxyConfig, node string, epoch int,
            pilotSAN []string, opts map[string]interface{}, localEnv []string,
            nodeIPs []string, dnsRefreshRate string) (string, error) {
      Add a small change to this function so that pilot-agent recognizes HOST_IP.

  33. Our trial
    Configure Envoy
    • Figure out the process of check (in progress)
      • Dump the config of a running Envoy sidecar
      • Envoy config is hard!
        Cluster, listener, router, filter, …: so many new concepts, in over 7000 lines.
      • Istio creates a custom filter named mixer
        This filter is a Mixer client responsible for interacting with the
        Mixer server through gRPC.
        We do not yet know how to write the config for this filter correctly.

  34. Outline
    Service mesh and Istio
    • Architecture
    • Packet flow
    Performance measurement of Istio
    Where the problem is
    Our trial
    Istio’s next feature

  35. Istio’s next feature
    Extensibility v2 (formerly known as Mixer v2)
    • Envoy communicates with backends directly to reduce the overhead
      between Envoy and Mixer.
    [Diagram: three backends reached through Mixer, next to three backends reached directly]
    No change to Envoy
    No restart of Envoy
    https://www.youtube.com/watch?v=XdWmm_mtVXI