envoy - Resilience -

An overview of Envoy's resilience features, for a lightning talk (LT).

Yasuhiro Murata

November 13, 2019

Transcript

  1. envoy – Resilience -
    Cloud LT
    Yasuhiro Murata
    2019.11.13

  2. Resilience in envoy
    - 4 ideas of resilience
      - Circuit Breaking
      - Automatic Retries
      - Health Checks
      - Backpressure

  3. Circuit Breaking ?

  4. Circuit Breaking ? - 1/2
    Fail quickly and apply back pressure downstream for your
    connections.
    https://www.envoyproxy.io/learn/

  5. Circuit Breaking ? - 2/2
    How do you avoid a failure in one part of your infrastructure
    cascading into other parts of your infrastructure?
    One approach is to use circuit breaking.
    https://www.envoyproxy.io/learn/circuit-breaking

  6. How to configure ?

  7. How to configure ?
    circuit_breakers:
      thresholds:
      - priority: DEFAULT
        max_connections: 1000
        max_requests: 1000
      - priority: HIGH
        max_connections: 2000
        max_requests: 2000

    - priority
      - how routes defined as DEFAULT or HIGH are treated by the circuit breaker.
    - max_connections
      - the maximum number of connections that Envoy will make to our service clusters.
    - max_requests
      - the maximum number of parallel requests that Envoy makes to our service clusters.
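
    For context, the thresholds above live inside an upstream cluster definition. The fragment below is a minimal sketch of where they might sit, assuming a statically configured cluster; the cluster name `service_backend`, the address `backend.example.internal`, and the port are hypothetical placeholders, not values from the talk.

    ```yaml
    static_resources:
      clusters:
      - name: service_backend                 # hypothetical cluster name
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: service_backend
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: backend.example.internal   # hypothetical host
                    port_value: 8080                     # hypothetical port
        # Circuit breaker limits apply per cluster and per priority.
        circuit_breakers:
          thresholds:
          - priority: DEFAULT
            max_connections: 1000
            max_requests: 1000
          - priority: HIGH
            max_connections: 2000
            max_requests: 2000
    ```

    Because the limits are set per cluster, each upstream service can be given its own thresholds.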

  8. Breaking policy with HTTP/1.1 and HTTP/2
    - HTTP/1.1 and HTTP/2 have different connection behaviors: HTTP/1.1 carries
      one request at a time per connection, while HTTP/2 multiplexes many
      requests over a single connection
      - For HTTP/1.1 connections, use max_connections.
      - For HTTP/2 connections, use max_requests.
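
    As a rough illustration of this policy, the fragment below sketches two clusters, one speaking HTTP/1.1 and one speaking HTTP/2 upstream. It assumes the v3 API form for enabling HTTP/2 (`typed_extension_protocol_options`); the cluster names and limits are hypothetical, and endpoint definitions are omitted for brevity.

    ```yaml
    clusters:
    - name: service_http1                     # hypothetical; HTTP/1.1 upstream (the default)
      connect_timeout: 0.25s
      circuit_breakers:
        thresholds:
        - priority: DEFAULT
          max_connections: 1000               # bounds concurrency for HTTP/1.1
    - name: service_http2                     # hypothetical; HTTP/2 upstream
      connect_timeout: 0.25s
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}
      circuit_breakers:
        thresholds:
        - priority: DEFAULT
          max_requests: 1000                  # bounds concurrency for HTTP/2
    ```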

  9. Break-on-latency
    - Envoy doesn't directly provide an option to trip the breaker on
      latency, but...
      - lower the timeout that triggers a retry, then let the circuit breaker
        trip when too many retries are in flight, using the max_retries option
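
    One way this might look in configuration is sketched below as two fragments: a route-level retry policy with a short per-try timeout, and a cluster-level `max_retries` threshold. The cluster name `service_backend` and all numeric values are hypothetical, chosen only for illustration.

    ```yaml
    # Route fragment: a slow attempt hits per_try_timeout and is retried.
    routes:
    - match:
        prefix: "/"
      route:
        cluster: service_backend              # hypothetical cluster name
        retry_policy:
          retry_on: 5xx
          num_retries: 3
          per_try_timeout: 0.5s               # hypothetical latency threshold
    ---
    # Cluster fragment: cap the number of concurrent retries to the cluster.
    circuit_breakers:
      thresholds:
      - priority: DEFAULT
        max_retries: 3                        # further retries are rejected while this many are active
    ```

    With this combination, sustained high latency produces many retries, and the retry budget then acts as the breaker.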

  10. PASSIVE Health Checking ?

  11. PASSIVE Health Checking ?
    Outlier Detection
    https://www.envoyproxy.io/learn/health-check

  12. Outlier Detection
    - Uses the responses from real requests to determine whether an
      endpoint is healthy
      - Removes or re-inserts hosts using a timeout-based approach

        consecutive_5xx: "3"
        base_ejection_time: "30s"

      - This configuration ejects a host for 30 seconds (on its first ejection;
        the duration grows with each repeated ejection) when it returns 3
        consecutive 5xx errors
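
    In a full configuration, these two settings would sit under `outlier_detection` on a cluster. The fragment below is a minimal sketch; the cluster name is hypothetical, and `interval` and `max_ejection_percent` are additional Envoy fields not mentioned on the slide, shown with illustrative values.

    ```yaml
    clusters:
    - name: service_backend                   # hypothetical cluster name
      connect_timeout: 0.25s
      outlier_detection:
        consecutive_5xx: 3                    # eject after 3 consecutive 5xx responses
        base_ejection_time: 30s               # base ejection duration
        interval: 10s                         # assumption: how often ejection analysis runs
        max_ejection_percent: 50              # assumption: never eject more than half the hosts
    ```

    Capping the ejection percentage keeps a widespread failure from emptying the whole cluster.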
