[RailsConf 2019] Troubleshoot Your RoR Microservices with Distributed Tracing

In a microservices architecture, it is often challenging to understand the interactions and dependencies between the individual components involved in an end-user request. Distributed tracing is a technique that improves the observability of such microservice behavior and helps you understand performance bottlenecks, investigate cascading failures, or troubleshoot in general.

In this talk, I will show how we’ve implemented distributed tracing in Rails apps using OpenCensus, a set of vendor-neutral libraries for collecting and exporting metrics and traces. I will also share real-world examples from our system, which consists of about 100 microservices built with Ruby, Go, Python, Node and Rust.

If troubleshooting Rails microservices is painful for you, you’ll love distributed tracing.

Yoshinori Kawasaki

May 02, 2019



  1. ©2019 Wantedly, Inc. Troubleshoot Your RoR Microservices with Distributed Tracing

    RailsConf 2019 - Minneapolis May 2, 2019 - Yoshinori Kawasaki (@kawasy/@luvtechno)
  2. Tokyo based company. We “create a world where work drives passion,” by helping people uncover jobs they are passionate about, visit future teammates who share the same values, and meet an exciting company they’ll love working for.
  3. In this talk…
    1. Why Distributed Tracing and how it helps
    2. What OpenCensus is and how to use it
  4. Distributed Tracing: What it is and how it helps in microservices architecture
  5. Dependencies [diagram: services A to G serving Feature X and Feature Y]
  6. Dependencies [diagram: services A to G serving Feature X and Feature Y]
  7. When You See High Error Rate: No code change, but new errors. Why??? [diagram: services A to G, Feature X, Feature Y]
  8. When You See High Error Rate: If you don’t know dependencies… [diagram: services A to G with unknown links marked ????]
  9. When You See High Error Rate: You’ll have to speculate to find a root cause [diagram: services A to G with unknown links marked ????]
  10. When You See High Error Rate: Even if you know dependencies… [diagram: services A to G, Feature X, Feature Y]
  11. When You See High Error Rate: You still need to investigate which feature is affected [diagram: services A to G, Feature X, Feature Y]
  12. Distributed Tracing
    • A trace is a set of operations in an end-to-end request.
    • A trace has spans. A span represents an operation.
    • Spans can be nested, which represents causality.
    • A span has a name, and contains start and end time with other annotations.
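The trace and span relationship described on this slide can be sketched in plain Ruby. This is a minimal illustration only, not the OpenCensus implementation: `SketchSpanNode` and `SketchTrace` are hypothetical names, and the span names are borrowed from the Rails notification events shown later in the deck.

```ruby
# Hypothetical, minimal model: a span has a name, a parent, start and end
# times, and child spans. Nesting records which operation caused which.
SketchSpanNode = Struct.new(:name, :parent, :start_time, :end_time, :children) do
  def duration
    end_time - start_time
  end
end

class SketchTrace
  attr_reader :root_spans

  def initialize
    @root_spans = []
    @current = nil
  end

  # Runs the block inside a new span nested under the current span.
  def in_span(name)
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    span = SketchSpanNode.new(name, @current, now, nil, [])
    (@current ? @current.children : @root_spans) << span
    @current = span
    yield span
  ensure
    # Always close the span and restore the parent, even on exceptions.
    span.end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @current = span.parent
  end
end

trace = SketchTrace.new
trace.in_span("GET /projects") do
  trace.in_span("sql.active_record") { :query }
  trace.in_span("render_template.action_view") { :render }
end
```

After this runs, `trace.root_spans.first` is the request span and its `children` hold the two nested operations in the order they ran.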
  13. OpenCensus: What it is and how it helps in microservices architecture
  14. OpenCensus
    1. Libraries for tracing and metrics
      • Java, Go, Node, Python, C++, C#, PHP, Ruby, Erlang/Elixir
      • Focuses on capturing and sending data in your app; no analysis part is provided.
    2. Exporters for the backend(s) of your choice
      • Send to Stackdriver, Datadog, Jaeger, Zipkin, Prometheus, etc.
      • Can export to multiple backends at once.
    3. Out-of-the-box integrations
      • Rack middleware for inbound requests, Faraday middleware for outbound requests.
      • Instruments events like ActiveRecord database calls, ActionView renders, etc.
  15. Span
    • trace_id: a 16-byte unique identifier for a trace.
    • span_id: an 8-byte unique identifier for a span.
    • parent_span_id: span_id of this span's parent span.
    • name: a description. Can be a method name, or a file name and a line number.
    • kind: UNSPECIFIED, SERVER, or CLIENT.
    • start_time, end_time: when a span starts and ends.
    • attributes: a set of key-value pairs. Value can be string, integer, double or bool.
    • stack_trace: a stack trace at the start.
    • time_events: a time-stamped annotation or sent/received message event in a span.
    • links: a pointer from the current span to another span in the same or a different trace.
    • status: the final status of the span.
    See https://github.com/census-instrumentation/opencensus-proto/blob/master/src/opencensus/proto/trace/v1/trace.proto
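To make the identifier sizes concrete, here is a small sketch that builds a parent and a child span record. It assumes IDs are random bytes rendered as lowercase hex (16 bytes become 32 hex characters, 8 bytes become 16). `SketchSpanRecord`, `new_trace_id`, `new_span_id`, and the `http.method` attribute are hypothetical names for illustration, and only a subset of the fields is modeled.

```ruby
require "securerandom"

# Hypothetical helpers: random IDs rendered as lowercase hex strings.
def new_trace_id
  SecureRandom.hex(16) # 16 bytes -> 32 hex characters
end

def new_span_id
  SecureRandom.hex(8) # 8 bytes -> 16 hex characters
end

# A subset of the span fields listed on the slide.
SketchSpanRecord = Struct.new(
  :trace_id, :span_id, :parent_span_id, :name, :kind,
  :start_time, :end_time, :attributes,
  keyword_init: true
)

root = SketchSpanRecord.new(
  trace_id: new_trace_id,
  span_id: new_span_id,
  parent_span_id: nil,          # a root span has no parent
  name: "GET /users",
  kind: :SERVER,
  start_time: Time.now,
  attributes: { "http.method" => "GET" }
)

child = SketchSpanRecord.new(
  trace_id: root.trace_id,      # same trace as the parent
  span_id: new_span_id,
  parent_span_id: root.span_id, # links the child to its parent
  name: "sql.active_record",
  kind: :UNSPECIFIED,
  start_time: Time.now,
  attributes: {}
)
```

Every span in a request shares one `trace_id`, while `parent_span_id` is what lets a backend rebuild the nesting.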
  16. Architecture [diagram: services, each containing Application Logic, a Collector, and Exporter(s); services are connected by HTTP requests, and Backends receive the exported data]
  17. OpenCensus Ruby [diagram labels: Inbound HTTP request, Rack Middleware, Rails, Faraday Middleware, Outbound HTTP request, Collector, Exporter(s), Backend]
  18. Configure Tracing

        # Gemfile
        gem 'opencensus'

        # When a process starts
        OpenCensus.configure do |c|
          c.trace.middleware_placement = :begin
          c.trace.exporter = exporter
          c.trace.default_sampler = \
            OpenCensus::Trace::Samplers::Probability.new(0.01)
          c.trace.default_max_attributes = 64
        end
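The `Probability.new(0.01)` setting above asks for roughly 1% of requests to be traced. The idea behind a probability sampler can be sketched as follows. This is a simplified illustration under the assumption that each decision is an independent random draw; `SketchProbabilitySampler` is a hypothetical class, and the real OpenCensus sampler may take other inputs into account.

```ruby
# Hypothetical probability sampler: trace a request when a uniform random
# number in [0, 1) falls below the configured rate.
class SketchProbabilitySampler
  def initialize(rate, rng: Random.new)
    unless (0.0..1.0).cover?(rate)
      raise ArgumentError, "rate must be between 0 and 1"
    end

    @rate = rate
    @rng = rng
  end

  # Returns true when this request should be traced.
  def sample?
    @rng.rand < @rate
  end
end

# A seeded generator keeps the sketch reproducible.
sampler = SketchProbabilitySampler.new(0.01, rng: Random.new(42))
sampled = Array.new(100_000) { sampler.sample? }.count(true)
puts "sampled #{sampled} of 100000 requests"
```

A low rate keeps tracing overhead and backend cost small on busy services, at the price of missing most individual requests.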
  19. Configure Exporter

        # Datadog exporter
        uri = URI.parse(ENV['DATADOG_APM_AGENT_URL'])
        c.exporter = OpenCensus::Trace::Exporters::Datadog.new \
          service: app_name,
          agent_hostname: uri.host,
          agent_port: uri.port

        # Stackdriver exporter
        keyfile = Base64.strict_decode64(ENV['STACKDRIVER_JSON_KEY_BASE64'])
        c.exporter = OpenCensus::Trace::Exporters::Stackdriver.new \
          project_id: gcp_project_id,
          credentials: JSON.parse(keyfile)

        # multiple exporters
        c.exporter = OpenCensus::Trace::Exporters::Multi.new(*exporters)
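Exporting to several backends at once amounts to fanning each batch of spans out to every configured exporter. The sketch below illustrates that idea only; `SketchMultiExporter` and `SketchMemoryExporter` are hypothetical classes, and the sole assumption about exporters is that they respond to `export(spans)`, as in the Rack middleware slide.

```ruby
# Hypothetical fan-out exporter: forwards each batch to every delegate.
class SketchMultiExporter
  def initialize(*exporters)
    @exporters = exporters
  end

  def export(spans)
    @exporters.each { |exporter| exporter.export(spans) }
    nil
  end
end

# Hypothetical in-memory exporter standing in for a real backend.
class SketchMemoryExporter
  attr_reader :batches

  def initialize
    @batches = []
  end

  def export(spans)
    @batches << spans
  end
end

first = SketchMemoryExporter.new
second = SketchMemoryExporter.new
multi = SketchMultiExporter.new(first, second)
multi.export(%w[span-a span-b])
```

Both delegates receive the same batch, which is handy when migrating between backends or comparing them side by side.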
  20. Rails Integration

        # application.rb
        require 'opencensus/trace/integrations/rails' # <- Rails::Railtie

        # the toplevel configuration object is exposed as `config.opencensus`
        config.opencensus.trace.default_max_attributes = 64
  21. Rack Middleware

        class OpenCensus::Trace::Integrations::RackMiddleware
          def call env
            formatter = Formatters::TraceContext.new
            context = formatter.deserialize env[formatter.rack_header_name]
            Trace.start_request_trace \
              trace_context: context,
              same_process_as_parent: false do |span_context|
              begin
                Trace.in_span get_path(env) do |span|
                  start_request span, env
                  @app.call(env).tap do |response|
                    finish_request span, response
                  end
                end
              ensure
                @exporter.export span_context.build_contained_spans
              end
            end
          end
        end
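The middleware above deserializes an incoming header into a trace context so that the new span joins the caller's trace. As a rough sketch of what such a formatter does, the code below parses and serializes a header in the W3C Trace Context `traceparent` layout (`version-traceid-spanid-flags`). That layout is an assumption here: the slides do not show which header name or exact format the gem's `TraceContext` formatter uses, and `SketchTraceContext`, `parse_traceparent`, and `serialize_traceparent` are hypothetical names.

```ruby
# Assumed layout: 2 hex chars of version, 32 of trace id, 16 of span id,
# and 2 of flags, separated by dashes.
TRACEPARENT_PATTERN = /\A([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})\z/

SketchTraceContext = Struct.new(:trace_id, :span_id, :trace_options)

# Returns a context, or nil when the header is missing or malformed.
def parse_traceparent(header)
  match = TRACEPARENT_PATTERN.match(header.to_s)
  return nil unless match

  SketchTraceContext.new(match[2], match[3], match[4].to_i(16))
end

# Builds the header value to send on an outbound request.
def serialize_traceparent(context)
  format("00-%s-%s-%02x", context.trace_id, context.span_id, context.trace_options)
end

incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
context = parse_traceparent(incoming)
```

When the header is absent or invalid, the sketch returns `nil`, which corresponds to starting a fresh trace rather than continuing one.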
  22. ActiveSupport::Notifications

        DEFAULT_NOTIFICATION_EVENTS = [
          "sql.active_record",
          "render_template.action_view",
          "send_file.action_controller",
          "send_data.action_controller",
          "deliver.action_mailer"
        ].freeze

        def setup_notifications
          OpenCensus::Trace.configure.notifications.events.each do |type|
            ActiveSupport::Notifications.subscribe(type) do |*args|
              event = ActiveSupport::Notifications::Event.new(*args)
              handle_notification_event event
            end
          end
        end
  23. ActiveSupport::Notifications (cont)

        def handle_notification_event event
          span_context = OpenCensus::Trace.span_context
          if span_context
            ns = OpenCensus::Trace.configure.notifications.attribute_namespace
            span = span_context.start_span event.name, skip_frames: 2
            span.start_time = event.time
            span.end_time = event.end
            event.payload.each do |k, v|
              span.put_attribute "#{ns}#{k}", v.to_s
            end
          end
        end

    See https://api.rubyonrails.org/classes/ActiveSupport/Notifications.html
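The last step of the handler copies the event payload into span attributes, prefixing each key with a namespace and stringifying each value. That mapping can be sketched in isolation. `namespaced_attributes` is a hypothetical helper, and the `"rails/"` namespace and the sample payload are assumptions chosen for illustration.

```ruby
# Hypothetical helper mirroring the payload loop in the handler:
# every key gets the namespace prefix and every value becomes a string.
def namespaced_attributes(payload, namespace)
  payload.each_with_object({}) do |(key, value), attrs|
    attrs["#{namespace}#{key}"] = value.to_s
  end
end

# Sample payload shaped like a "sql.active_record" event (illustrative).
payload = { sql: "SELECT 1", name: "User Load", binds: [] }
attrs = namespaced_attributes(payload, "rails/")
```

Stringifying values keeps the attributes within the types a span accepts, at the cost of losing structure for arrays and hashes.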
  24. Faraday Middleware

        class OpenCensus::Trace::Integrations::FaradayMiddleware < ::Faraday::Middleware
          def call request_env
            span_context = request_env[:span_context]
            span_name = extract_span_name(request_env)
            span = span_context.start_span span_name, sampler: @sampler
            start_request span, request_env
            begin
              @app.call(request_env).on_complete do |response_env|
                finish_request span, response_env
              end
            rescue StandardError => e
              span.set_status 2, e.message
              raise
            ensure
              span_context.end_span span
            end
          end
        end
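The `rescue` / `raise` / `ensure` structure in this middleware records the failure on the span, lets the original exception propagate, and still closes the span. A stripped-down sketch of that control flow, without Faraday, looks like this. `SketchClientSpan` and `traced_call` are hypothetical names, and the status code `2` simply mirrors the value on the slide.

```ruby
SketchClientSpan = Struct.new(:name, :status, :ended)

# Hypothetical wrapper around an outbound call.
def traced_call(span)
  yield
rescue StandardError => e
  span.status = [2, e.message] # record the error on the span
  raise                        # re-raise so callers still see the failure
ensure
  span.ended = true            # the span is closed on success and failure
end

ok_span = SketchClientSpan.new("/users", nil, false)
traced_call(ok_span) { :response }

failed_span = SketchClientSpan.new("/jobs", nil, false)
begin
  traced_call(failed_span) { raise IOError, "connection reset" }
rescue IOError
  # The caller handles the original exception as usual.
end
```

Closing the span in `ensure` matters because an unclosed span would otherwise have no end time and the failed request would be missing from the exported trace.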
  25. Faraday Middleware (cont)

        conn = Faraday.new(url: api_base_url) do |c|
          c.use OpenCensus::Trace::Integrations::FaradayMiddleware,
                span_name: ->(env) { env[:url].path }
          c.adapter Faraday.default_adapter
        end

    We recommend doing this in a private gem shared by your Rails apps.
  26. Custom Span API

        OpenCensus::Trace.in_span "long task" do
          t = rand * 10
          sleep t
        end

        def in_span name, kind: nil, skip_frames: 0, sampler: nil
          span = start_span name, kind: kind,
                            skip_frames: skip_frames + 1, sampler: sampler
          begin
            yield span
          ensure
            end_span span
          end
        end
  27. To summarize…
    1. Distributed tracing makes you productive in microservices architecture
    2. You can easily adopt distributed tracing using OpenCensus