How to get to zero unhandled exceptions in production

Radoslav Stankov
September 06, 2019

Practical tips and tricks about how to deal with exceptions in Ruby on Rails applications.

Video of the talk 👉 https://www.youtube.com/watch?v=btUnSR-NGV0

Transcript

  1. How to get to zero
    unhandled exceptions
    in production
    Radoslav Stankov 07/09/2019

  2. Radoslav Stankov
    @rstankov

    blog.rstankov.com

    github.com/rstankov

    twitter.com/rstankov


  3. Product Hunt Engineering Team

  4. Happy Friday
    Fix bugs
    Goodies features
    Pay technical debt
    Catch up on projects
    Fix exceptions

  6. “Have a process around exceptions.”

    Tip

  7. Back to basics


  8. http://exceptionalruby.com/


  9. def perform
       do_something
     rescue
       # Anti-pattern: a bare rescue silently swallows every StandardError
     end

  10. def perform
        do_something
      rescue SpecificError
      end

  11. def perform
        do_something
      rescue SpecificError
        # NOTE(rstankov): Reason to return nil
        nil
      end

    Your name, not mine

  12. def perform
        do_something
      rescue Timeout::Error
        # NOTE(rstankov): WiFi Sucks
        nil
      end

  13. “Be explicit about exceptions. Handle
    specific errors and explain why
    they happen.”

    Tip
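
A minimal sketch of this tip in plain Ruby. `fetch_avatar` is a hypothetical helper, not code from the talk, and the block stands in for a slow remote call. It rescues only `Timeout::Error`, documents why `nil` is acceptable, and lets every other error propagate.

```ruby
require 'timeout'

# Hypothetical helper for illustration; the block stands in for a remote call.
def fetch_avatar
  yield
rescue Timeout::Error
  # NOTE(yourname): The avatar service is slow at times; nil means "use the default avatar".
  nil
end

# The expected, explained failure is handled and turned into nil.
handled = fetch_avatar { raise Timeout::Error, 'avatar service timed out' }

# An unexpected bug is not swallowed; it still reaches the caller and the tracker.
surfaced =
  begin
    fetch_avatar { nil.upcase }
  rescue NoMethodError => e
    e.class
  end
```

Here `handled` is `nil`, while `surfaced` holds `NoMethodError`, because the narrow `rescue` clause does not match it.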

  14. Raven.configure do |config|
        # NOTE(rstankov): Exclude unactionable errors
        config.excluded_exceptions = [
          'Rack::Timeout::RequestExpiryError',
          'Rack::Timeout::RequestTimeoutException',
          'ActionController::RoutingError',
          'ActionController::InvalidAuthenticityToken',
          'ActionDispatch::ParamsParser::ParseError',
          'Sidekiq::Shutdown',
        ]
      end

  15. ArgumentError: invalid byte sequence in UTF-8

  16. # NOTE(rstankov): Fix invalid byte sequence in UTF-8. More info:
      # - https://robots.thoughtbot.com/fight-back-utf-8-invalid-byte-sequences
      module Handle::InvalidByteSequence
        extend self
        def call(string)
          return if string.nil?
          string.encode('UTF-8', 'binary',
            invalid: :replace,
            undef: :replace,
            replace: ''
          )
        end
      end

  17. “Reduce noise. See only exceptional errors in
    your tracker.”

    Tip
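
A small sketch of how the sanitizer from slide 16 behaves on real input. `strip_invalid_bytes` mirrors the slide's `encode` call. `scrub_invalid_bytes` is an alternative assumed here, not something from the talk; it uses `String#scrub` (Ruby 2.1+). Both method names are hypothetical.

```ruby
# Mirrors the encode call from the Handle::InvalidByteSequence slide.
# The source encoding is declared as binary, so every non-ASCII byte is dropped.
def strip_invalid_bytes(string)
  return if string.nil?

  string.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

# Alternative (an assumption, not from the talk): String#scrub removes only the
# bytes that are invalid in the string's own encoding.
def scrub_invalid_bytes(string)
  return if string.nil?

  string.scrub('')
end

broken = "caf\xC3\xA9 \xFFok" # a valid "é" followed by a stray 0xFF byte

strip_invalid_bytes(broken) # => "caf ok"  (the "é" is removed as well)
scrub_invalid_bytes(broken) # => "café ok" (only the 0xFF byte is removed)
```

Which behaviour is preferable depends on whether the input is expected to contain legitimate non-ASCII text.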

  18. account.subscription.status

  19. account.subscription&.status

  20. Steps to fix
    ✅ Check for other accounts without a subscription
    ✅ Find out why some accounts don't have a subscription
    ✅ Fix the root problem
    ✅ Add missing subscriptions to accounts

  21. if account.ship_subscription.blank?
        Raven.capture_message "Missing ship subscription", extra: { account_id: account.id }
        return :no_subscription
      end

  22. “Don't hide exceptions. Fix root causes.”

    Tip
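
A runnable sketch of the pattern from slide 21, without Rails or Sentry. `FakeTracker` is a hypothetical stand-in for the tracker client (the talk calls `Raven.capture_message`), and `Account`, `Subscription` and `subscription_status` are illustrative names. Instead of hiding the missing record behind `&.`, the code reports it and returns an explicit value.

```ruby
# Hypothetical stand-in for an error tracker client; it only collects messages.
module FakeTracker
  extend self

  def messages
    @messages ||= []
  end

  def capture_message(message, extra: {})
    messages << { message: message, extra: extra }
  end
end

Account = Struct.new(:id, :subscription)
Subscription = Struct.new(:status)

def subscription_status(account)
  if account.subscription.nil?
    # Report the broken data so someone can fix the root cause.
    FakeTracker.capture_message('Missing subscription', extra: { account_id: account.id })
    return :no_subscription
  end

  account.subscription.status
end

subscription_status(Account.new(1, Subscription.new(:active))) # => :active
subscription_status(Account.new(2, nil))                       # => :no_subscription
FakeTracker.messages.last[:extra]                              # => { account_id: 2 }
```

The caller still gets a usable value, and the tracker records which account needs repair.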


  23. https://graphql.org/


  24. Which query causes
    this issue?

  25. class Frontend::GraphqlController < Frontend::BaseController
        before_action :ensure_query
        def index
          render json: Graph::Schema.execute(query, variables: variables, context: context)
        rescue => e
          handle_error e
        end
        private
        # ...
        def handle_error(error)
          if Rails.env.development?
            logger.error error.message
            logger.error error.backtrace.join("\n")
            render json: { error: { message: error.message, backtrace: error.backtrace } }, status: 500
          elsif Rails.env.test?
            p error.message
            p error.backtrace
            render json: { error: { message: error.message, backtrace: error.backtrace } }, status: 500
          else
            # Attach the GraphQL query so the report shows which query failed
            Raven.capture_exception(error, extra: { query: query })
            render json: { error: { message: 'SERVER_ERROR' }, data: {} }, status: 500
          end
        end
      end

  28. “Invest in your monitoring and exception
    traceability. When you have a hard time tracing
    an exception, ask yourself: what more
    information do I need? Then add it.”

    Tip
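
One way to make "then add it" routine is a small wrapper that attaches context to every report. The sketch below is plain Ruby under stated assumptions: `ErrorLog` is a hypothetical stand-in for the tracker client (the talk calls `Raven.capture_exception`), and `with_error_context` and `execute_query` are illustrative names.

```ruby
# Hypothetical stand-in for an error tracker client; it only collects reports.
module ErrorLog
  extend self

  def reports
    @reports ||= []
  end

  def capture_exception(error, extra: {})
    reports << { error: error, extra: extra }
  end
end

# Runs the block. On failure it records the error together with the given
# context and re-raises, so the report answers "which input caused this?".
def with_error_context(extra)
  yield
rescue StandardError => e
  ErrorLog.capture_exception(e, extra: extra)
  raise
end

def execute_query(query)
  with_error_context(query: query) do
    raise ArgumentError, 'unknown field' if query.include?('bogus')

    { data: {} }
  end
end
```

When `execute_query('{ bogus }')` raises, the last entry in `ErrorLog.reports` carries the failing query in its `extra` hash.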


  29. https://sidekiq.org/


  30. [Diagram: Use action → Achievements::Job → Achievement (unique)]

  31. class Achievements::Job < ApplicationJob
        def perform(achievement, user)
          Achievements::Create.call achievement, user
        end
      end

  32. [Diagram, two parallel flows:
    Use action → Achievements::Job → Achievement (unique)
    Use action → Achievements::Job → Achievement (unique)]

  34. [Diagram, two parallel flows:
    Use action → Achievements::Job → Achievement (unique)
    Use action → Achievements::Job → PG::UniqueViolation]

  36. class Achievements::Job < ApplicationJob
        def perform(achievement, user)
          Handle::RaceCondition.call do
            Achievements::Create.call achievement, user
          end
        end
      end

  37. module Handle::RaceCondition
        extend self
        UNIQUE_ACTIVE_RECORD_ERROR = 'has already been taken'.freeze
        def call
          # `||=` keeps the counter across `retry`, which re-runs the method body
          retries ||= 2
          yield
        rescue ActiveRecord::RecordNotUnique, PG::UniqueViolation
          retries -= 1
          raise unless retries.nonzero?
          retry
        rescue ActiveRecord::RecordInvalid => e
          # Only retry validation failures caused by a uniqueness check
          raise unless e.message.include? UNIQUE_ACTIVE_RECORD_ERROR
          retries -= 1
          raise unless retries.nonzero?
          retry
        end
      end
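
The retry idea on slide 37 can be tried without Rails. In this sketch `DuplicateRecord` is a hypothetical stand-in for `ActiveRecord::RecordNotUnique`, and `RetryOnConflict` is an illustrative name, not the talk's module. It runs the block again when the first run loses a uniqueness race, and gives up after a fixed number of runs.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordNotUnique, so the sketch runs without Rails.
class DuplicateRecord < StandardError; end

module RetryOnConflict
  extend self

  # Runs the block, retrying when it loses a uniqueness race,
  # for at most `attempts` runs in total.
  def call(attempts: 2)
    tries ||= 0 # `||=` keeps the count when `retry` re-runs the method body
    tries += 1
    yield
  rescue DuplicateRecord
    raise if tries >= attempts

    retry
  end
end

calls = 0
result = RetryOnConflict.call do
  calls += 1
  # First run: another worker inserted the same row a moment ago.
  raise DuplicateRecord, 'has already been taken' if calls == 1

  :existing_record # second run finds the row created by the other worker
end
```

After this runs, `result` is `:existing_record` and `calls` is `2`. A block that fails on every run re-raises `DuplicateRecord` after the second attempt.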

  38. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  39. Errno::ECONNRESET

  41. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET
          retry_job
        end
      end

  42. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError
          retry_job
        end
      end

  43. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError, Timeout::Error
          retry_job
        end
      end

  44. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError, Timeout::Error, ...
          retry_job
        end
      end

  45. module Handle::NetworkErrors
        extend self
        ERRORS = [
          EOFError,
          Errno::ECONNRESET,
          Errno::EINVAL,
          Net::HTTPBadResponse,
          Net::HTTPHeaderSyntaxError,
          Net::ProtocolError,
          Timeout::Error,
          # ...
        ]
        # `rescue Handle::NetworkErrors` matches the raised error through `===`
        def ===(error)
          ERRORS.any? { |error_class| error_class === error }
        end
      end

  46. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        rescue Handle::NetworkErrors
          retry_job
        end
      end
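
The `rescue Handle::NetworkErrors` line works because `rescue` matches the raised exception with `===`, and the module overrides that operator. A self-contained sketch of the mechanism follows. `TransientErrors` and `deliver` are hypothetical names, and the list is limited to error classes that ship with Ruby so it runs without extra gems.

```ruby
require 'timeout'

# Hypothetical matcher modelled on the Handle::NetworkErrors slide.
module TransientErrors
  extend self

  ERRORS = [EOFError, Errno::ECONNRESET, Timeout::Error].freeze

  # `rescue TransientErrors` calls this with the raised exception.
  def ===(error)
    ERRORS.any? { |error_class| error_class === error }
  end
end

def deliver
  yield
  :delivered
rescue TransientErrors
  :retry_later
end

deliver { :sent }                   # => :delivered
deliver { raise EOFError }          # => :retry_later
deliver { raise Errno::ECONNRESET } # => :retry_later
```

Errors outside the list, such as `ArgumentError`, are not matched and propagate to the caller.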

  47. class Notifications::Deliver::Job < ApplicationJob
        retry_on(*::Handle::NetworkErrors::ERRORS, wait: 2.minutes, attempts: 5)
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  48. class Notifications::Deliver::Job < ApplicationJob
        include Handle::Job::NetworkErrors
        queue_as :notifications
        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  49. module Handle::Job::NetworkErrors
        def self.included(job)
          job.retry_on(*::Handle::NetworkErrors::ERRORS, wait: 2.minutes, attempts: 5)
        end
      end
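
The `self.included` hook on slide 49 is plain Ruby, so it can be illustrated without Active Job. In this sketch `BaseJob.retry_on` is a hypothetical stand-in that only records its arguments, imitating the shape of the Active Job macro. `RetryNetworkErrors`, `DeliverJob` and `PlainJob` are illustrative names, and `wait` is given in seconds because `2.minutes` needs Active Support.

```ruby
# Hypothetical minimal job base class; `retry_on` only records what it was given.
class BaseJob
  def self.retry_rules
    @retry_rules ||= []
  end

  def self.retry_on(*errors, wait:, attempts:)
    retry_rules << { errors: errors, wait: wait, attempts: attempts }
  end
end

module RetryNetworkErrors
  ERRORS = [EOFError, Errno::ECONNRESET].freeze

  # Ruby calls this hook with the including class, so the retry policy
  # is declared once and applied by a single `include`.
  def self.included(job)
    job.retry_on(*ERRORS, wait: 120, attempts: 5)
  end
end

class DeliverJob < BaseJob
  include RetryNetworkErrors
end

class PlainJob < BaseJob; end

DeliverJob.retry_rules.first[:attempts] # => 5
PlainJob.retry_rules                    # => []
```

Only classes that include the module receive the policy, which keeps the opt-in explicit for each job.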

  50. module Handle::NetworkErrors
        extend self
        ERRORS = [
          # ...
          Faraday::ConnectionFailed,
          Faraday::TimeoutError,
          RestClient::BadGateway,
          RestClient::BadRequest,
          # ...
        ]
        def ===(error)
          ERRORS.any? { |error_class| error_class === error }
        end
      end

  51. “Have tooling around handling common
    exceptions. Make it a no-brainer to process
    everyday errors.”

    Tip

  52. Recap

    Have a process around exceptions
    Be explicit about exceptions
    Reduce noise
    Don't hide exceptions
    Invest in your monitoring
    Have tooling around handling common exceptions

  53. https://speakerdeck.com/rstankov
