Resilient Systems with Circuit Breaker

It's common for software systems to make remote calls to software running in different processes, probably on different machines across a network. One of the big differences between in-memory calls and remote calls is that remote calls can fail, or hang without a response until some timeout limit is reached. What's worse, if you have many callers on an unresponsive supplier, you can run out of critical resources, leading to cascading failures across multiple systems. In his excellent book Release It!, Michael Nygard popularized the Circuit Breaker pattern to prevent this kind of catastrophic cascade.

Guilherme Cavalcanti

July 28, 2014

Transcript

  1. “A resilient system keeps processing transactions, even when there are transient impulses, persistent stresses, or component failures disrupting normal processing.” (from Release It!)
  2. Circuit Breaker
    • Nowadays it’s common to have apps composed of many other apps communicating via APIs
    • Calls to 3rd-party services will fail
    • Failure chain
  3. Initialization

        class CircuitBreaker
          def initialize(options = {})
            @timeout = options.fetch(:timeout, 0.01)                     # seconds allowed per call
            @failure_threshold = options.fetch(:failure_threshold, 5)   # failures before the circuit opens
            @silent = false
            @failure_count = 0
          end
  4. #handle
    • Strategy pattern
    • Circuit state
    • Try to execute and capture exceptions

        def handle(&block)
          case state
          when :closed then try_to_execute(&block)
          when :open then handle_open
          end
        end
  5. #try_to_execute
    • Failure counter
    • Try with timeout
    • Reset counter

        require "timeout"

        def try_to_execute(&block)
          result = yield_with_timeout(&block)
          reset
          result          # return the block's value rather than reset's
        rescue Timeout::Error
          record_failure
          raise           # re-raise the timeout for the caller
        end

        def yield_with_timeout(&b)
          Timeout.timeout(@timeout, &b)
        end

        def reset
          @failure_count = 0
        end

        def record_failure
          @failure_count += 1
        end
  6. Using

        require "faraday"
        require "circuit_breaker"

        circuit = CircuitBreaker.new
        conn = Faraday.new

        # Every remote call goes through the same breaker
        circuit.handle do
          conn.get('http://api.foo.com/me.json')
        end

        circuit.handle do
          conn.get('http://api.foo.com/accounts.json')
        end
  7. .new

        class CircuitBreaker
          def initialize(options = {})
            @timeout = options.fetch(:timeout, 0.01)   # read by yield_with_timeout
            @failure_threshold = options.fetch(:failure_threshold, 5)
            @silent = false
            @failure_count = 0
            @reset_timeout = 0.1        # seconds to wait before a trial call
            @last_failure_time = nil    # read by #state, cleared by reset
          end
  8. Changes
    • Try to execute
    • Last failure time
    • Reset last failure time

        def handle(&b)
          case state
          when :closed, :half_open
            try_to_execute(&b)
          when :open then handle_open
          end
        end

        def try_to_execute(&block)
          result = yield_with_timeout(&block)
          reset
          result          # return the block's value rather than reset's
        rescue Timeout::Error
          record_failure
          raise           # re-raise the timeout for the caller
        end

        def yield_with_timeout(&b)
          Timeout.timeout(@timeout, &b)
        end

        def reset
          @failure_count = 0
          @last_failure_time = nil
        end

        def record_failure
          @last_failure_time = Time.now
          @failure_count += 1
        end
  9. #state

        def state
          case
          when (@failure_count >= @failure_threshold) &&
               (Time.now - @last_failure_time) > @reset_timeout
            :half_open   # threshold reached, but the reset timeout has elapsed
          when @failure_count >= @failure_threshold
            :open
          else
            :closed
          end
        end
  10. Improvements
    • Use AASM or Statesman to manage the state machine
    • Monitoring infrastructure using dependency injection
    • Distributed circuit breaker
    • Detect errors other than timeout
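
The slides never show `handle_open`, and "Detect errors other than timeout" is left as a future improvement. The sketch below is one possible way to put the pieces from the slides into a single runnable class. It is not the author's implementation: the `CircuitOpenError` class, the body of `handle_open`, and the `:errors` and `:reset_timeout` options are assumptions made for illustration.

```ruby
require "timeout"

# Consolidated sketch of the breaker walked through in the slides.
class CircuitBreaker
  # Hypothetical error raised while the circuit is open (not in the deck).
  class CircuitOpenError < StandardError; end

  def initialize(options = {})
    @timeout           = options.fetch(:timeout, 0.01)
    @failure_threshold = options.fetch(:failure_threshold, 5)
    # Assumption: the deck hard-codes 0.1; here it is an option.
    @reset_timeout     = options.fetch(:reset_timeout, 0.1)
    # Assumption: configurable list of exceptions that count as failures.
    @errors            = options.fetch(:errors, [Timeout::Error])
    @failure_count     = 0
    @last_failure_time = nil
  end

  # Runs the block when closed or half-open; refuses to run it when open.
  def handle(&block)
    case state
    when :closed, :half_open then try_to_execute(&block)
    when :open               then handle_open
    end
  end

  # Same rules as the #state slide, written with if/else.
  def state
    if @failure_count >= @failure_threshold
      (Time.now - @last_failure_time) > @reset_timeout ? :half_open : :open
    else
      :closed
    end
  end

  private

  def try_to_execute(&block)
    result = Timeout.timeout(@timeout, &block)
    reset
    result
  rescue *@errors
    record_failure
    raise # re-raise the original error for the caller
  end

  # Assumption: fail fast instead of calling the remote service.
  def handle_open
    raise CircuitOpenError, "circuit is open; call skipped"
  end

  def reset
    @failure_count = 0
    @last_failure_time = nil
  end

  def record_failure
    @last_failure_time = Time.now
    @failure_count += 1
  end
end
```

Raising a dedicated error while the circuit is open lets callers tell a skipped call apart from a slow one, for example by rescuing `CircuitBreaker::CircuitOpenError` and serving a cached or default response. Exceptions outside the `:errors` list still propagate but do not count toward the failure threshold.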