[RailsConf 2019] Troubleshoot Your RoR Microservices with Distributed Tracing

In a microservices architecture, it is often hard to understand the interactions and dependencies between the individual components involved in an end-user request. Distributed tracing is a technique that improves the observability of microservice behavior and helps you understand performance bottlenecks, investigate cascading failures, or troubleshoot in general.

In this talk, I will show how we implemented distributed tracing in Rails apps using OpenCensus, a set of vendor-neutral libraries for collecting and exporting metrics and traces. I will also share real-world examples from our system, which consists of about 100 microservices built with Ruby, Go, Python, Node and Rust.

If troubleshooting Rails microservices is painful for you, you’ll love distributed tracing.

Yoshinori Kawasaki

May 02, 2019

Transcript

  1. ©2019 Wantedly, Inc.
    Troubleshoot
    Your RoR Microservices
    with Distributed Tracing
    RailsConf 2019 - Minneapolis
    May 2, 2019 - Yoshinori Kawasaki (@kawasy/@luvtechno)

  2. ©2019 Wantedly, Inc.
    Yoshinori Kawasaki
    Wantedly, Inc.
    @luvtechno
    @kawasy (Japanese OK)

  3. ©2019 Wantedly, Inc.
    Tokyo based company. We “create a
    world where work drives passion,” by
    helping people uncover jobs they are
    passionate about, visit future
    teammates who share the same values,
    and meet an exciting company they’ll
    love working for.

  4. ©2019 Wantedly, Inc.

  5. ©2019 Wantedly, Inc.
    Microservices
    — or —
    Monolith

  6. ©2019 Wantedly, Inc.
    Please vote and RT https://twitter.com/luvtechno/

  7. ©2019 Wantedly, Inc.
    In this talk…
    1. Why Distributed Tracing and how it helps
    2. What OpenCensus is and how to use it

  8. ©2019 Wantedly, Inc.
    Distributed Tracing
    What it is and how it helps in microservices architecture

  9. ©2019 Wantedly, Inc.
    Microservices are great,
    but too hard?

  10. ©2019 Wantedly, Inc.
    Monolith

  11. ©2019 Wantedly, Inc.
    Microservices

  12. ©2019 Wantedly, Inc.
    Hard to tell components involved
    in a single end-user request

  13. ©2019 Wantedly, Inc.

  14. ©2019 Wantedly, Inc.
    Dependencies
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y]

  16. ©2019 Wantedly, Inc.
    When You See High Error Rate
    No code change, but new errors. Why???
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y]

  17. ©2019 Wantedly, Inc.
    When You See High Error Rate
    If you don’t know dependencies…
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y; two spots are marked "????"]

  18. ©2019 Wantedly, Inc.
    When You See High Error Rate
    You’ll have to speculate to find a root cause
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y; two spots are marked "????"]

  19. ©2019 Wantedly, Inc.
    When You See High Error Rate
    Even if you know dependencies…
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y]

  20. ©2019 Wantedly, Inc.
    When You See High Error Rate
    You still need to investigate which feature is affected
    [Diagram: services A, B, C, D, E, F and G, with Feature X and Feature Y]

  21. ©2019 Wantedly, Inc.
    Distributed tracing helps you!

  22. ©2019 Wantedly, Inc.
    Distributed Tracing
    [Diagram: spans for A, C, D, F, DB and G laid out along a time axis]

  23. ©2019 Wantedly, Inc.
    • A trace is a set of operations in an end-to-end request.
    • A trace has spans. A span represents an operation.
    • Spans can be nested, which represents causality.
    • A span has a name, and contains start and end time with other annotations.
    Distributed Tracing
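
The definitions above can be sketched in a few lines of plain Ruby. This is only an illustration of the data model, not OpenCensus code: `MiniTrace`, `MiniSpan` and the span names are hypothetical, and a real tracer also handles IDs, sampling and export.

```ruby
# A minimal, hypothetical model of a trace and its spans (not the OpenCensus API).
MiniSpan = Struct.new(:name, :parent, :start_time, :end_time) do
  # Duration in seconds once the span has ended.
  def duration
    end_time - start_time
  end
end

class MiniTrace
  attr_reader :spans

  def initialize
    @spans = []  # every span recorded in this trace, in start order
    @stack = []  # currently open spans; the last one is the parent of the next
  end

  # Runs the block inside a new span nested under the currently open span.
  def in_span(name)
    span = MiniSpan.new(name, @stack.last, now, nil)
    @spans << span
    @stack.push(span)
    yield span
  ensure
    @stack.pop
    span.end_time = now if span
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end

trace = MiniTrace.new
trace.in_span("GET /projects") do
  trace.in_span("sql.active_record") { sleep 0.01 }
  trace.in_span("GET /users/1") { sleep 0.01 }
end

trace.spans.each do |s|
  parent = s.parent ? s.parent.name : "(root)"
  puts format("%-20s parent=%-15s %.1f ms", s.name, parent, s.duration * 1000)
end
```

Nesting falls out of the stack: whichever span is open when a new one starts becomes its parent, which is how a trace records causality.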

  24. ©2019 Wantedly, Inc.
    Demo

  28. ©2019 Wantedly, Inc.
    OpenCensus
    What it is and how it helps in microservices architecture

  29. ©2019 Wantedly, Inc.
    OpenCensus is…
    a set of vendor-neutral libraries
    to collect and export
    traces and metrics.

  30. ©2019 Wantedly, Inc.
    1. Libraries for tracing and metrics
    • Java, Go, Node, Python, C++, C#, PHP, Ruby, Erlang/Elixir
    • Focus on capturing and sending data from your app; no analysis component is provided.
    2. Exporters for the backend(s) of your choice
    • Send to StackDriver, DataDog, Jaeger, Zipkin, Prometheus, etc.
    • Can export to multiple backends at once.
    3. Out-of-the-box integrations
    • Rack middleware for inbound requests, Faraday middleware for outbound requests.
    • Instruments events such as ActiveRecord database calls, ActionView renders, etc.
    OpenCensus

  31. ©2019 Wantedly, Inc.
    • trace_id: a 16-byte unique identifier for a trace.
    • span_id: an 8-byte unique identifier for a span.
    • parent_span_id: the span_id of this span's parent span.
    • name: a description. Can be a method name, or a file name and a line number.
    • kind: UNSPECIFIED, SERVER, or CLIENT.
    • start_time, end_time: when a span starts and ends.
    • attributes: a set of key-value pairs. A value can be a string, integer, double or bool.
    • stack_trace: a stack trace captured at the start of the span.
    • time_events: time-stamped annotations or sent/received message events in a span.
    • links: pointers from the current span to another span in the same or a different trace.
    • status: the final status of the span.
    Span
    See https://github.com/census-instrumentation/opencensus-proto/blob/master/src/opencensus/proto/trace/v1/trace.proto
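
To make the field list concrete, here is a rough sketch of two related spans as plain Ruby records. `SpanRecord`, `new_trace_id` and `new_span_id` are hypothetical names, the attribute key is illustrative, and rendering the IDs as hex strings is an assumption made for readability.

```ruby
require "securerandom"

# Hypothetical plain-Ruby record mirroring a few of the span fields listed above.
SpanRecord = Struct.new(:trace_id, :span_id, :parent_span_id, :name, :kind,
                        :start_time, :end_time, :attributes, keyword_init: true)

# 16 random bytes rendered as 32 lowercase hex characters.
def new_trace_id
  SecureRandom.hex(16)
end

# 8 random bytes rendered as 16 lowercase hex characters.
def new_span_id
  SecureRandom.hex(8)
end

trace_id = new_trace_id

# The root span has no parent; its kind is SERVER because it handles a request.
root = SpanRecord.new(trace_id: trace_id, span_id: new_span_id, parent_span_id: nil,
                      name: "GET /projects", kind: :SERVER,
                      start_time: Time.now, end_time: nil, attributes: {})

# The child shares the trace_id and points at the root through parent_span_id.
child = SpanRecord.new(trace_id: trace_id, span_id: new_span_id,
                       parent_span_id: root.span_id,
                       name: "GET /users/1", kind: :CLIENT,
                       start_time: Time.now, end_time: nil,
                       attributes: { "http.method" => "GET" })

puts "trace #{root.trace_id}"
puts "child #{child.span_id} has parent #{child.parent_span_id}"
```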

  32. ©2019 Wantedly, Inc.
    Architecture
    [Diagram: three Applications, each with Logic, a Collector and Exporter(s); two Backends; an HTTP request between applications]

  33. ©2019 Wantedly, Inc.
    OpenCensus Ruby
    Backend
    Rails Collector Exporter(s)
    Inbound HTTP request
    Rack
    Middleware
    Faraday
    Middleware
    Outbound HTTP request

  34. ©2019 Wantedly, Inc.
    # Gemfile
    gem 'opencensus'
    # When a process starts
    OpenCensus.configure do |c|
      c.trace.middleware_placement = :begin
      c.trace.exporter = exporter
      c.trace.default_sampler = \
        OpenCensus::Trace::Samplers::Probability.new(0.01)
      c.trace.default_max_attributes = 64
    end
    Configure Tracing
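
The `0.01` passed to the sampler above means that roughly 1% of requests are traced. The idea can be sketched as follows; `RateSampler` is a hypothetical class written for illustration, and the way the real sampler makes its decision may differ.

```ruby
# Hypothetical sampler: decides once per request whether to record a trace.
class RateSampler
  def initialize(rate, rng: Random.new)
    raise ArgumentError, "rate must be between 0 and 1" unless (0.0..1.0).cover?(rate)

    @rate = rate
    @rng = rng
  end

  # Returns true for roughly `rate` of all calls.
  # rand yields a float in [0, 1), so rate 0 never samples and rate 1 always does.
  def sample?
    @rng.rand < @rate
  end
end

sampler = RateSampler.new(0.01, rng: Random.new(42))
sampled = 100_000.times.count { sampler.sample? }
puts "sampled #{sampled} of 100000 requests"
```

A low rate keeps tracing overhead and backend cost small on busy services, at the price of missing most individual requests.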

  35. ©2019 Wantedly, Inc.
    # DataDog exporter
    uri = URI.parse(ENV['DATADOG_APM_AGENT_URL'])
    c.exporter = OpenCensus::Trace::Exporters::Datadog.new \
      service: app_name,
      agent_hostname: uri.host,
      agent_port: uri.port
    # StackDriver exporter
    keyfile = Base64.strict_decode64(ENV['STACKDRIVER_JSON_KEY_BASE64'])
    c.exporter = OpenCensus::Trace::Exporters::Stackdriver.new \
      project_id: gcp_project_id,
      credentials: JSON.parse(keyfile)
    # multiple exporters
    c.exporter = OpenCensus::Trace::Exporters::Multi.new(*exporters)
    Configure Exporter
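
Exporting to several backends at once amounts to fanning each batch of spans out to every exporter. The sketch below shows that idea with hypothetical classes (`FanOutExporter`, `StdoutExporter`, `MemoryExporter`); it assumes an exporter is any object that responds to `export(spans)`. Swallowing errors from one backend is a design choice made for this sketch, not a statement about how the library behaves.

```ruby
# Hypothetical exporter that prints span names.
class StdoutExporter
  def export(spans)
    spans.each { |span| puts "exported #{span[:name]}" }
  end
end

# Hypothetical exporter that keeps spans in memory, handy for tests.
class MemoryExporter
  attr_reader :received

  def initialize
    @received = []
  end

  def export(spans)
    @received.concat(spans)
  end
end

# Forwards each batch to every configured exporter. A failing backend is
# logged and skipped so the others still receive the batch.
class FanOutExporter
  def initialize(*exporters)
    @exporters = exporters
  end

  def export(spans)
    @exporters.each do |exporter|
      begin
        exporter.export(spans)
      rescue StandardError => e
        warn "exporter #{exporter.class} failed: #{e.message}"
      end
    end
  end
end

memory = MemoryExporter.new
exporter = FanOutExporter.new(StdoutExporter.new, memory)
exporter.export([{ name: "GET /projects" }, { name: "sql.active_record" }])
```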

  36. ©2019 Wantedly, Inc.
    Rails Integration
    # application.rb
    require 'opencensus/trace/integrations/rails' # <- Rails::Railtie
    # the toplevel configuration object is exposed as `config.opencensus`
    config.opencensus.trace.default_max_attributes = 64

  37. ©2019 Wantedly, Inc.
    class OpenCensus::Trace::Integrations::RackMiddleware
      def call env
        formatter = Formatters::TraceContext.new
        context = formatter.deserialize env[formatter.rack_header_name]
        Trace.start_request_trace \
          trace_context: context,
          same_process_as_parent: false do |span_context|
          begin
            Trace.in_span get_path(env) do |span|
              start_request span, env
              @app.call(env).tap do |response|
                finish_request span, response
              end
            end
          ensure
            @exporter.export span_context.build_contained_spans
          end
        end
      end
    end
    Rack Middleware
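
The middleware above continues a trace by deserializing a context header from the inbound request. As an illustration of what such a header can carry, the sketch below reads and writes the W3C Trace Context `traceparent` format. The helper names are hypothetical, and whether a given OpenCensus version uses exactly this header name and format is an assumption here.

```ruby
# Hypothetical helpers for the W3C Trace Context `traceparent` header:
#   version "-" trace-id (32 hex) "-" parent-id (16 hex) "-" flags (2 hex)
TRACEPARENT_PATTERN = /\A(\h{2})-(\h{32})-(\h{16})-(\h{2})\z/

ParsedContext = Struct.new(:trace_id, :span_id, :sampled)

def serialize_traceparent(trace_id, span_id, sampled)
  format("00-%s-%s-%02x", trace_id, span_id, sampled ? 1 : 0)
end

# Returns nil for a missing or malformed header so the caller can start a new trace.
def deserialize_traceparent(header)
  match = TRACEPARENT_PATTERN.match(header.to_s.strip.downcase)
  return nil unless match
  # All-zero identifiers are invalid.
  return nil if match[2] == "0" * 32 || match[3] == "0" * 16

  sampled = (match[4].to_i(16) & 1) == 1
  ParsedContext.new(match[2], match[3], sampled)
end

# Service A adds the header to an outbound request ...
header = serialize_traceparent("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true)

# ... and service B reads it from the Rack env of the inbound request.
env = { "HTTP_TRACEPARENT" => header }
context = deserialize_traceparent(env["HTTP_TRACEPARENT"])
puts "continuing trace #{context.trace_id} under span #{context.span_id}"
```

Because both services agree on the trace ID, the backend can stitch their spans into one end-to-end trace.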

  38. ©2019 Wantedly, Inc.
    DEFAULT_NOTIFICATION_EVENTS = [
      "sql.active_record",
      "render_template.action_view",
      "send_file.action_controller",
      "send_data.action_controller",
      "deliver.action_mailer"
    ].freeze
    def setup_notifications
      OpenCensus::Trace.configure.notifications.events.each do |type|
        ActiveSupport::Notifications.subscribe(type) do |*args|
          event = ActiveSupport::Notifications::Event.new(*args)
          handle_notification_event event
        end
      end
    end
    ActiveSupport::Notification

  39. ©2019 Wantedly, Inc.
    def handle_notification_event event
      span_context = OpenCensus::Trace.span_context
      if span_context
        ns = OpenCensus::Trace.configure.notifications.attribute_namespace
        span = span_context.start_span event.name, skip_frames: 2
        span.start_time = event.time
        span.end_time = event.end
        event.payload.each do |k, v|
          span.put_attribute "#{ns}#{k}", v.to_s
        end
      end
    end
    ActiveSupport::Notification (cont)
    See https://api.rubyonrails.org/classes/ActiveSupport/Notifications.html
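
The handler above turns each notification event into a span whose attributes are the namespaced, stringified payload entries. The mapping can be tried without Rails using the sketch below. `FakeEvent` and `event_to_span` are hypothetical stand-ins, the `"rails/"` namespace is an assumed value, and the attribute cap is a simplified take on the `default_max_attributes` setting shown earlier.

```ruby
# Hypothetical stand-in for a notification event, so the sketch runs without Rails.
FakeEvent = Struct.new(:name, :started_at, :finished_at, :payload)

# Converts an event into a plain Hash describing a span. Payload keys are
# prefixed with a namespace and values are stringified, as in the handler above.
def event_to_span(event, namespace: "rails/", max_attributes: 64)
  attributes = {}
  event.payload.first(max_attributes).each do |key, value|
    attributes["#{namespace}#{key}"] = value.to_s
  end

  { name: event.name, start_time: event.started_at,
    end_time: event.finished_at, attributes: attributes }
end

now = Time.now
event = FakeEvent.new("sql.active_record", now, now + 0.004,
                      { sql: "SELECT * FROM users WHERE id = 1", name: "User Load" })

span = event_to_span(event)
puts span[:name]
puts span[:attributes].inspect
```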

  40. ©2019 Wantedly, Inc.
    class OpenCensus::Trace::Integrations::FaradayMiddleware < ::Faraday::Middleware
      def call request_env
        span_context = request_env[:span_context]
        span_name = extract_span_name(request_env)
        span = span_context.start_span span_name, sampler: @sampler
        start_request span, request_env
        begin
          @app.call(request_env).on_complete do |response_env|
            finish_request span, response_env
          end
        rescue StandardError => e
          span.set_status 2, e.message
          raise
        ensure
          span_context.end_span span
        end
      end
    end
    Faraday Middleware
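
The important part of the middleware above is its begin/rescue/ensure shape: a failure is recorded on the span, the exception still reaches the caller, and the span is closed either way. The sketch below isolates that pattern with a hypothetical `TracingMiddleware` that wraps any callable; it does not use Faraday, and spans are plain hashes.

```ruby
# Hypothetical outbound middleware: wraps any callable `app` and records one
# span per call, mirroring the begin/rescue/ensure shape of the code above.
class TracingMiddleware
  attr_reader :finished_spans

  def initialize(app)
    @app = app
    @finished_spans = []
  end

  def call(request)
    span = { name: request[:path], status: :ok, error: nil }
    begin
      @app.call(request)
    rescue StandardError => e
      span[:status] = :error   # mark the span as failed ...
      span[:error] = e.message
      raise                    # ... but let the caller see the exception
    ensure
      @finished_spans << span  # the span is always closed
    end
  end
end

ok_app = ->(request) { { status: 200, path: request[:path] } }
failing_app = ->(_request) { raise IOError, "connection reset" }

tracer = TracingMiddleware.new(ok_app)
tracer.call({ path: "/users/1" })
puts tracer.finished_spans.inspect

failing = TracingMiddleware.new(failing_app)
begin
  failing.call({ path: "/jobs" })
rescue IOError
  puts failing.finished_spans.inspect
end
```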

  41. ©2019 Wantedly, Inc.
    conn = Faraday.new(url: api_base_url) do |c|
      c.use OpenCensus::Trace::Integrations::FaradayMiddleware,
            span_name: ->(env) { env[:url].path }
      c.adapter Faraday.default_adapter
    end
    Faraday Middleware (cont)
    We recommend doing this in a private gem shared by your Rails apps.

  42. ©2019 Wantedly, Inc.
    Custom Span API
    # Usage: wrap any block of work in a custom span
    OpenCensus::Trace.in_span "long task" do
      t = rand * 10
      sleep t
    end
    # Definition of in_span: the span is ended even if the block raises
    def in_span name, kind: nil, skip_frames: 0, sampler: nil
      span = start_span name, kind: kind, skip_frames: skip_frames + 1,
                              sampler: sampler
      begin
        yield span
      ensure
        end_span span
      end
    end

  43. ©2019 Wantedly, Inc.
    OpenCensus + OpenTracing Merger
    https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0

  44. ©2019 Wantedly, Inc.
    https://storage.googleapis.com/open-source-software/OpenConsensus%20Roadmap.jpg

  45. ©2019 Wantedly, Inc.
    Summary

  46. ©2019 Wantedly, Inc.
    To summarize…
    1. Distributed tracing makes you productive in microservices architecture
    2. You can easily adopt distributed tracing using OpenCensus

  47. ©2019 Wantedly, Inc.
    Thank you!
    @luvtechno
    @kawasy (Japanese OK)

  48. ©2019 Wantedly, Inc.
    • @munisystem
    • @bgpat
    • @izumin
    • @Altech
    Special Thanks

  49. ©2019 Wantedly, Inc.
    • Paul Skorupskas (Unsplash)
    • Gabriel Sanchez (Unsplash)
    • Ian Stauffer (Unsplash)
    Photo Credit
