How to get to zero unhandled exceptions in production

Radoslav Stankov
September 06, 2019

Practical tips and tricks about how to deal with exceptions in Ruby on Rails applications.

Video of the talk 👉https://www.youtube.com/watch?v=btUnSR-NGV0

Transcript

  1. How to get to zero unhandled exceptions in production
     Radoslav Stankov 07/09/2019

  2. Radoslav Stankov
     @rstankov
     blog.rstankov.com
     github.com/rstankov
     twitter.com/rstankov

  9. Product Hunt Engineering Team

  17. Happy Friday
      - Fix bugs
      - Goodies features
      - Pay technical debt
      - Catch up on projects
      - Fix exceptions

  20. Tip: “Have a process around exceptions.”

  21. Back to basics


  22. http://exceptionalruby.com/

  23. def perform
        do_something
      rescue
      end

  25. def perform
        do_something
      rescue SpecificError
      end

  26. def perform
        do_something
      rescue SpecificError
        # NOTE(rstankov): Reason to return nil
        nil
      end

      (Your name, not mine.)

  27. def perform
        do_something
      rescue Timeout::Error
        # NOTE(rstankov): WiFi Sucks
        nil
      end

  28. Tip: “Be explicit about exceptions. Handle specific
      errors and explain why they happen.”
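A minimal sketch of why the bare `rescue` from the earlier slide is risky, assuming plain Ruby with no Rails. The method names `with_bare_rescue` and `with_specific_rescue` are hypothetical and exist only for this illustration. A bare `rescue` catches every `StandardError`, so it also swallows programming mistakes such as a `NoMethodError`, while a specific `rescue` lets them surface.

```ruby
require "timeout"

# Hypothetical helper: a bare rescue catches every StandardError.
def with_bare_rescue
  yield
rescue
  nil
end

# Hypothetical helper: only the expected, explained error is handled.
def with_specific_rescue
  yield
rescue Timeout::Error
  # NOTE(yourname): the remote service is occasionally slow; nil means "no data"
  nil
end

# A programming mistake (NoMethodError) is silently swallowed here ...
bare_result = with_bare_rescue { nil.upcase }

# ... but still surfaces when only Timeout::Error is rescued.
bug_surfaced =
  begin
    with_specific_rescue { nil.upcase }
    false
  rescue NoMethodError
    true
  end

# The expected error is still handled quietly.
timeout_result = with_specific_rescue { raise Timeout::Error, "slow network" }
```

With the specific version, the tracker keeps receiving the real bugs, and the comment records why the handled case is acceptable.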

  29. Monitoring

  34. Raven.configure do |config|
        # NOTE(rstankov): Exclude unactionable errors
        config.excluded_exceptions = [
          'Rack::Timeout::RequestExpiryError',
          'Rack::Timeout::RequestTimeoutException',
          'ActionController::RoutingError',
          'ActionController::InvalidAuthenticityToken',
          'ActionDispatch::ParamsParser::ParseError',
          'Sidekiq::Shutdown',
        ]
      end

  36. ArgumentError: invalid byte sequence in UTF-8

  39. # NOTE(rstankov): Fix invalid byte sequence in UTF-8. More info:
      # - https://robots.thoughtbot.com/fight-back-utf-8-invalid-byte-sequences
      module Handle::InvalidByteSequence
        extend self

        def call(string)
          return if string.nil?

          string.encode('UTF-8', 'binary',
            invalid: :replace,
            undef: :replace,
            replace: ''
          )
        end
      end
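A short usage sketch for the helper above, assuming plain Ruby. The `Handle` namespace is declared here only so the snippet runs on its own. One behaviour worth knowing: because the string is re-read as binary, every non-ASCII byte is dropped, including valid multibyte characters. The comparison with `String#scrub` is not from the talk; it is shown as a possible alternative when valid characters should be kept.

```ruby
# Stand-alone copy of the helper from the slide.
module Handle
  module InvalidByteSequence
    extend self

    def call(string)
      return if string.nil?

      string.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
    end
  end
end

# A UTF-8 string holding one valid multibyte character and one invalid byte.
dirty = "caf\u00E9 " + "\xFF"

# The slide's helper removes the invalid byte and every other non-ASCII byte.
stripped = Handle::InvalidByteSequence.call(dirty)

# String#scrub (Ruby 2.1+) removes only the invalid bytes.
scrubbed = dirty.scrub('')
```

Which behaviour is right depends on the data: stripping everything is predictable for ASCII-only fields, while `scrub` preserves user-visible text.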

  40. Tip: “Reduce noise. See only exceptional errors in
      your tracker.”

  43. account.subscription.status

  44. account.subscription&.status

  45. Steps to fix
      ✅ Check for other accounts without a subscription
      ✅ Find out why some accounts don't have a subscription
      ✅ Fix the root problem
      ✅ Add missing subscriptions to accounts

  48. if account.ship_subscription.blank?
        Raven.capture_message "Missing ship subscription", extra: { account_id: account.id }
        return :no_subscription
      end

  49. Tip: “Don't hide exceptions. Fix root causes.”
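The difference between hiding a missing record with `&.` and reporting it can be sketched in plain Ruby. Everything here is an assumption made for illustration: `Account` and `Subscription` are simple structs, and `ErrorReporter` is a hypothetical stand-in for the `Raven.capture_message` call shown earlier.

```ruby
Account = Struct.new(:id, :subscription)
Subscription = Struct.new(:status)

# Hypothetical stand-in for an error tracker such as Raven.
module ErrorReporter
  MESSAGES = []

  def self.capture_message(message, extra: {})
    MESSAGES << { message: message, extra: extra }
  end
end

# Hides the problem: callers get nil and nobody learns the data is broken.
def silent_status(account)
  account.subscription&.status
end

# Handles the case explicitly and leaves a trace that can be followed up.
def reported_status(account)
  if account.subscription.nil?
    ErrorReporter.capture_message("Missing subscription", extra: { account_id: account.id })
    return :no_subscription
  end

  account.subscription.status
end

healthy = Account.new(1, Subscription.new(:active))
broken = Account.new(2, nil)

silent_result = silent_status(broken)
reported_result = reported_status(broken)
healthy_result = reported_status(healthy)
```

The reported message carries the account id, which is what makes the "steps to fix" checklist actionable.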


  50. https://graphql.org/

  53. Which query causes this issue?

  54. class Frontend::GraphqlController < Frontend::BaseController
        before_action :ensure_query

        def index
          render json: Graph::Schema.execute(query, variables: variables, context: context)
        rescue => e
          handle_error e
        end

        private

        # ...

        def handle_error(error)
          if Rails.env.development?
            logger.error error.message
            logger.error error.backtrace.join("\n")
            render json: { error: { message: error.message, backtrace: error.backtrace } }, status: 500
          elsif Rails.env.test?
            p error.message
            p error.backtrace
            render json: { error: { message: error.message, backtrace: error.backtrace } }, status: 500
          else
            # Report the error together with the query that triggered it
            Raven.capture_exception(error, extra: { query: query })
            render json: { error: { message: 'SERVER_ERROR' }, data: {} }, status: 500
          end
        end
      end

  58. Tip: “Invest in your monitoring and exception
      traceability. When you have a hard time tracing
      an exception, ask yourself: what more
      information do I need? Then add it.”
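The idea of attaching the missing information to the report can be sketched without Rails or Sentry. `FakeTracker` and `run_query` are hypothetical names used only for this illustration; the `extra:` keyword mirrors the `Raven.capture_exception` call from the controller slide.

```ruby
# Hypothetical stand-in for an error tracker; it just records what it is given.
module FakeTracker
  REPORTS = []

  def self.capture_exception(error, extra: {})
    REPORTS << { error: error, extra: extra }
  end
end

# Hypothetical query runner: fails for queries containing "bogus".
def run_query(query)
  raise ArgumentError, "unknown field" if query.include?("bogus")

  { data: {} }
rescue StandardError => error
  # Attach the input that triggered the failure, so the report answers
  # "which query causes this issue?" without further digging.
  FakeTracker.capture_exception(error, extra: { query: query })
  { error: { message: "SERVER_ERROR" }, data: {} }
end

ok_response = run_query("{ viewer { id } }")
failed_response = run_query("{ bogus }")
```

The client still sees only a generic error, while the tracker entry holds the exact query.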


  59. https://sidekiq.org/

  61. User action -> Achievements::Job -> Achievement (unique)

  62. class Achievements::Job < ApplicationJob
        def perform(achievement, user)
          Achievements::Create.call achievement, user
        end
      end

  63. User action -> Achievements::Job -> Achievement (unique)
      User action -> Achievements::Job -> Achievement (unique)

  64. User action -> Achievements::Job -> Achievement (unique)
      User action -> Achievements::Job -> Achievement (unique)

  65. User action -> Achievements::Job -> Achievement (unique)
      User action -> Achievements::Job -> PG::UniqueViolation

  66. class Achievements::Job < ApplicationJob
        def perform(achievement, user)
          Achievements::Create.call achievement, user
        end
      end

  67. class Achievements::Job < ApplicationJob
        def perform(achievement, user)
          Handle::RaceCondition.call do
            Achievements::Create.call achievement, user
          end
        end
      end

  68. module Handle::RaceCondition
        extend self

        UNIQUE_ACTIVE_RECORD_ERROR = 'has already been taken'.freeze

        def call
          retries ||= 2
          yield
        rescue ActiveRecord::RecordNotUnique, PG::UniqueViolation
          retries -= 1
          raise unless retries.nonzero?
          retry
        rescue ActiveRecord::RecordInvalid => e
          raise unless e.message.include? UNIQUE_ACTIVE_RECORD_ERROR
          retries -= 1
          raise unless retries.nonzero?
          retry
        end
      end
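The `retries ||= 2` plus `retry` idiom above can be exercised without a database. This is a reduced sketch, not the original module: `UniqueViolation` is a hypothetical stand-in for `ActiveRecord::RecordNotUnique`, and `RetryOnConflict` is an assumed name. Because `retry` re-runs the method body while local variables keep their values, `retries ||= 2` is only initialised on the first pass, which gives two attempts in total.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordNotUnique / PG::UniqueViolation.
class UniqueViolation < StandardError; end

module RetryOnConflict
  extend self

  def call
    retries ||= 2 # set once; keeps its value when `retry` re-runs the body
    yield
  rescue UniqueViolation
    retries -= 1
    raise unless retries.nonzero? # give up after the second failure
    retry
  end
end

# First attempt collides, second attempt succeeds.
attempts = 0
result = RetryOnConflict.call do
  attempts += 1
  raise UniqueViolation, "duplicate key" if attempts == 1

  :created
end

# A block that always collides is tried twice, then the error propagates.
failing_attempts = 0
gave_up =
  begin
    RetryOnConflict.call do
      failing_attempts += 1
      raise UniqueViolation, "duplicate key"
    end
    false
  rescue UniqueViolation
    true
  end
```

In the achievements case, the second attempt usually finds the record created by the competing job, so the uniqueness check passes without raising.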

  69. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  70. Errno::ECONNRESET

  72. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  73. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET
          retry_job
        end
      end

  74. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError
          retry_job
        end
      end

  75. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError, Timeout::Error
          retry_job
        end
      end

  76. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        rescue Errno::ECONNRESET, EOFError, Timeout::Error, ...
          retry_job
        end
      end

  77. module Handle::NetworkErrors
        extend self

        ERRORS = [
          EOFError,
          Errno::ECONNRESET,
          Errno::EINVAL,
          Net::HTTPBadResponse,
          Net::HTTPHeaderSyntaxError,
          Net::ProtocolError,
          Timeout::Error,
          # ...
        ]

        def ===(error)
          ERRORS.any? { |error_class| error_class === error }
        end
      end
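The module above works in a `rescue` clause because `rescue` matches exceptions with `===`. A reduced, stand-alone sketch of that mechanism follows, using only errors from the Ruby standard library. `NetworkErrorMatcher` and `classify` are hypothetical names for this illustration.

```ruby
require "timeout"

# A module that overrides === on itself (via `extend self`), so that
# `rescue NetworkErrorMatcher` matches any error in the list.
module NetworkErrorMatcher
  extend self

  ERRORS = [EOFError, Errno::ECONNRESET, Timeout::Error].freeze

  def ===(error)
    ERRORS.any? { |error_class| error_class === error }
  end
end

# Hypothetical helper: reports how the block ended.
def classify
  yield
  :ok
rescue NetworkErrorMatcher
  :network_error
end

reset_result = classify { raise Errno::ECONNRESET }
timeout_result = classify { raise Timeout::Error, "slow" }
fine_result = classify { 1 + 1 }

# Errors outside the list are not matched and keep propagating.
other_propagated =
  begin
    classify { raise ArgumentError, "not a network problem" }
    false
  rescue ArgumentError
    true
  end
```

Keeping the list in one constant means a newly discovered network error is added in a single place.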

  78. class Notifications::DeliverJob < ApplicationJob
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        rescue Handle::NetworkErrors
          retry_job
        end
      end

  79. class Notifications::Deliver::Job < ApplicationJob
        retry_on(*::Handle::NetworkErrors::ERRORS, wait: 2.minutes, attempts: 5)
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  80. class Notifications::Deliver::Job < ApplicationJob
        include Handle::Job::NetworkErrors
        queue_as :notifications

        def perform(event)
          Notifications::Deliver.call(event)
        end
      end

  81. module Handle::Job::NetworkErrors
        def self.included(job)
          job.retry_on(*::Handle::NetworkErrors::ERRORS, wait: 2.minutes, attempts: 5)
        end
      end
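The `self.included` hook above can be tried without Active Job. In this sketch `FakeJob` is a hypothetical stand-in for `ApplicationJob` whose `retry_on` only records the rule, `RetriesNetworkErrors` is an assumed module name, and the wait is given as plain seconds because `2.minutes` needs Active Support.

```ruby
# Hypothetical stand-in for ApplicationJob: retry_on just stores the rule.
class FakeJob
  def self.retry_rules
    @retry_rules ||= []
  end

  def self.retry_on(*errors, wait:, attempts:)
    retry_rules << { errors: errors, wait: wait, attempts: attempts }
  end
end

# Including this module registers the shared retry policy on the job class.
module RetriesNetworkErrors
  ERRORS = [EOFError, Errno::ECONNRESET].freeze

  def self.included(job)
    job.retry_on(*ERRORS, wait: 120, attempts: 5)
  end
end

class DeliverJob < FakeJob
  include RetriesNetworkErrors
end

class OtherJob < FakeJob
end

deliver_rules = DeliverJob.retry_rules
other_rules = OtherJob.retry_rules
```

Only jobs that include the module get the policy, so opting a job into network retries is a one-line change.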

  84. module Handle::NetworkErrors
        extend self

        ERRORS = [
          # ...
          Faraday::ConnectionFailed,
          Faraday::TimeoutError,
          RestClient::BadGateway,
          RestClient::BadRequest,
          # ...
        ]

        def ===(error)
          ERRORS.any? { |error_class| error_class === error }
        end
      end

  85. Tip: “Have tooling around handling common
      exceptions. Make it a no-brainer to process
      everyday errors.”

  88. Recap

  90. Recap
      - Have a process around exceptions
      - Be explicit about exceptions
      - Reduce noise
      - Don't hide exceptions
      - Invest in your monitoring
      - Have tooling around handling common exceptions

  93. Thanks

  94. https://speakerdeck.com/rstankov