Slide 1

Slide 1 text

Peeking into Rails apps using OpenTelemetry 💎🔭
Prathamesh Sonpatki, @last9io
Ruby on Rails Global Summit ’23

Slide 2

Slide 2 text

Instrumentation? 🤔 🤨

Slide 3

Slide 3 text

Instrumentation? 🤔 🤨
- How do you know your application is running as expected?

Slide 4

Slide 4 text

Instrumentation? 🤔 🤨
- How do you know your application is running as expected?
- Service Level Agreements (SLAs)

Slide 5

Slide 5 text

Instrumentation? 🤔 🤨
- How do you know your application is running as expected?
- Service Level Agreements (SLAs)
- A good night’s sleep 😴 💤

Slide 6

Slide 6 text

💡“Hope is not a strategy!”
https://sre.google/sre-book/introduction/

Slide 7

Slide 7 text

💡A reliability mandate starts with instrumentation.
You can only improve what you measure.

Slide 8

Slide 8 text

🌈 The Landscape of Instrumentation

Slide 9

Slide 9 text

🌈 The Landscape of Instrumentation
- Your application is not standalone

Slide 10

Slide 10 text

🌈 The Landscape of Instrumentation
- Your application is not standalone
- It’s actually a 🍔

Slide 11

Slide 11 text

🌈 The Landscape of Instrumentation
- Your application is not standalone
- It’s actually a 🍔
  - The bun (cloud/VM)
  - The patty (application)
  - Along with mayo (RDS/DB)
  - And ketchup (third-party services)

Slide 12

Slide 12 text

🌈 The Landscape of Instrumentation
- Your application is not standalone
- It’s actually a 🍔
  - The bun (cloud/VM)
  - The patty (application)
  - Along with mayo (RDS/DB)
  - And ketchup (third-party services)
“Full stack observability” FTW!

Slide 13

Slide 13 text

💡Modern applications are like living organisms that grow and shrink in all possible directions. They also communicate with their peers!

Slide 14

Slide 14 text

Agenda 📄
- The need for instrumentation/observability
- Basic terminology
- What is OpenTelemetry?
- Otel architecture
- Otel + Rails
- Way forward

Slide 15

Slide 15 text

Basic Terminology

Slide 16

Slide 16 text

Observability 🪩
The measurement and attribution of performance in a complex software environment is called observability.
@realmeson10

Slide 17

Slide 17 text

Telemetry 📊
The collection of measurements or data at remote points and their automatic transmission (telecommunication) to receiving equipment for monitoring.

Slide 18

Slide 18 text

Logs ⛈
- Why is something wrong?
- Structured vs. unstructured
- Easy to adopt, but consistency is hard

Started GET "/posts" for 192.168.0.102 at 2023-01-25 11:53:32 +0530
Processing by PostsController#index as HTML
  Post Load (0.3ms) SELECT "posts".* FROM "posts"
  Rendered posts/index.html.erb within layouts/application (3.0ms)
Completed 200 OK in 16ms (Views: 14.0ms | ActiveRecord: 0.3ms)
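To make the structured vs. unstructured distinction concrete, here is a minimal sketch that emits the same request as one JSON object per line, using only Ruby's standard library. The helper name `build_json_logger` and the field names (`method`, `path`, `duration_ms`, and so on) are assumptions chosen for illustration, not a Rails or OpenTelemetry convention.

```ruby
require 'json'
require 'logger'
require 'stringio'
require 'time'

# Build a logger whose formatter writes one JSON object per line.
# Hash messages are merged into the record; anything else goes under :message.
def build_json_logger(io)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, time, _progname, msg|
    payload = msg.is_a?(Hash) ? msg : { message: msg.to_s }
    JSON.generate({ level: severity, time: time.utc.iso8601 }.merge(payload)) + "\n"
  end
  logger
end

# Log to an in-memory buffer so the example has no side effects.
LOG_BUFFER = StringIO.new
json_logger = build_json_logger(LOG_BUFFER)

# The same request as the plain-text Rails log, as named fields.
json_logger.info({
  method: 'GET',
  path: '/posts',
  controller: 'PostsController',
  action: 'index',
  status: 200,
  duration_ms: 16
})

puts LOG_BUFFER.string
```

Because every field has a name, a back-end can filter on `status` or aggregate `duration_ms` without parsing free text. The hard part the slide points to remains: every service has to agree on the field names.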

Slide 19

Slide 19 text

Metrics 📊
- What is wrong?
- Overview via aggregates
- Consistency and adoption
- Visualization using external tools
https://guides.rubyonrails.org/active_support_instrumentation.html
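As a rough illustration of "overview via aggregates", the sketch below reduces a list of request durations to a count, mean, maximum, and 95th percentile. The sample durations are made up, and `summarize` is a hypothetical helper; in a Rails app the numbers might come from Active Support instrumentation events such as those described in the guide linked above.

```ruby
# Summarize request durations (in milliseconds) into a few aggregates.
# Assumes a non-empty list of numbers.
def summarize(durations_ms)
  sorted = durations_ms.sort
  count = sorted.size
  # Nearest-rank percentile: the smallest sample with at least 95% of
  # all samples at or below it.
  p95_index = (0.95 * count).ceil - 1
  {
    count: count,
    mean_ms: sorted.sum.to_f / count,
    max_ms: sorted.last,
    p95_ms: sorted[p95_index]
  }
end

# Hypothetical sample: nine fast requests and one slow outlier.
SAMPLE_DURATIONS_MS = [16, 12, 20, 14, 250, 18, 15, 17, 13, 19].freeze

puts summarize(SAMPLE_DURATIONS_MS)
```

The aggregates show that something is wrong (a 250 ms outlier against a typical 12 to 20 ms), but not why. That is the gap logs and traces fill.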

Slide 20

Slide 20 text

Traces
- The entire journey of a workflow.
- Data that tracks an application request as it flows through the various parts of an application.
- Provides context for what exactly happened.
“Tracing is just structured logging at DEBUG level. Would you run production logs on debug level? NO!” - @nishantmodak

Slide 21

Slide 21 text

Span 🔧
- The basic building block of a trace.
- A span is equivalent to an event happening.
- A tree of spans makes a trace.
Image credit: https://docs.splunk.com/Observability/apm/apm-spans-traces/traces-spans.html
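The "tree of spans" idea can be sketched in plain Ruby. This is a toy model, not the OpenTelemetry data model: `ToySpan`, `render_trace`, and the timings are hypothetical, with span names loosely based on the `/posts` request from the Logs slide.

```ruby
# A toy span: a named, timed event that may contain child spans.
ToySpan = Struct.new(:name, :start_ms, :end_ms, :children) do
  def duration_ms
    end_ms - start_ms
  end
end

# Walk the tree depth-first and return one indented line per span.
def render_trace(span, depth = 0)
  line = "#{'  ' * depth}#{span.name} (#{span.duration_ms}ms)"
  [line] + span.children.flat_map { |child| render_trace(child, depth + 1) }
end

# The root span covers the whole request; children cover the steps inside it.
TOY_TRACE = ToySpan.new('GET /posts', 0, 16, [
  ToySpan.new('PostsController#index', 1, 16, [
    ToySpan.new('Post Load', 2, 3, []),
    ToySpan.new('render posts/index.html.erb', 4, 15, [])
  ])
])

puts render_trace(TOY_TRACE)
```

Real spans also carry trace and span IDs, attributes, and status, but the parent-child nesting is what turns individual events into a trace.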

Slide 22

Slide 22 text

Observability back-ends 📇
- A tool or service that receives telemetry data and provides ways to visualize and aggregate it, raise alerts, and send notifications.

Slide 23

Slide 23 text

Standardization Challenges 📜
- Different vendors
- Different formats
- Agents
- Nomenclature
https://last9.io/blog/observability-is-dead-long-live-observability/

Slide 24

Slide 24 text

OpenCensus 💻 OpenTracing 🪞

Slide 25

Slide 25 text

OpenTelemetry 🔭
https://www.cncf.io/blog/2021/08/26/opentelemetry-becomes-a-cncf-incubating-project/

Slide 26

Slide 26 text

What OpenTelemetry is not ⛔
OpenTelemetry is not an observability back-end.

Slide 27

Slide 27 text

Architecture 🏛

Slide 28

Slide 28 text

Architecture Demo

Slide 29

Slide 29 text

Architecture 🏛

Slide 30

Slide 30 text

Architecture 🏛

Slide 31

Slide 31 text

Architecture 🏛
https://lightstep.com/blog/opentelemetry-collector-design-and-architecture

Slide 32

Slide 32 text

OTLP 🧬
- Describes the encoding, transport, and delivery mechanism of telemetry data between telemetry sources, intermediate nodes such as collectors, and back-ends.
- gRPC and HTTP implementations
https://opentelemetry.io/docs/reference/specification/protocol/
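To give a feel for what travels over the wire, here is a hedged sketch that builds a single-span trace payload in OTLP's JSON encoding, as it might be POSTed over HTTP. It is hand-rolled for illustration only; in practice the OpenTelemetry SDK and exporter build and send this for you. The helper `otlp_trace_payload`, the scope name, the service name, and the timings are assumptions, and the URL is the conventional local collector endpoint, assumed rather than required.

```ruby
require 'json'
require 'securerandom'

# Conventional OTLP/HTTP traces endpoint on a local collector (an assumption;
# nothing in this sketch actually connects to it).
OTLP_TRACES_URL = 'http://localhost:4318/v1/traces'

# Build a minimal OTLP/JSON trace payload containing one server span.
def otlp_trace_payload(service_name:, span_name:, start_ns:, end_ns:)
  {
    resourceSpans: [{
      resource: {
        attributes: [{ key: 'service.name', value: { stringValue: service_name } }]
      },
      scopeSpans: [{
        scope: { name: 'hand-rolled-example' }, # hypothetical scope name
        spans: [{
          traceId: SecureRandom.hex(16), # 16 random bytes as 32 hex characters
          spanId: SecureRandom.hex(8),   # 8 random bytes as 16 hex characters
          name: span_name,
          kind: 2,                       # SPAN_KIND_SERVER
          startTimeUnixNano: start_ns.to_s,
          endTimeUnixNano: end_ns.to_s
        }]
      }]
    }]
  }
end

now_ns = Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond)
example_payload = otlp_trace_payload(
  service_name: 'my-rails-app', # hypothetical service name
  span_name: 'GET /posts',
  start_ns: now_ns - 16_000_000, # pretend the request took 16 ms
  end_ns: now_ns
)

# Sending it would be an HTTP POST of this JSON to OTLP_TRACES_URL with
# Content-Type: application/json; here it is only printed.
puts JSON.pretty_generate(example_payload)
```

The gRPC transport carries the same data model encoded as Protocol Buffers rather than JSON.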

Slide 33

Slide 33 text

Automatic and Manual Instrumentation ⚒
- Automatic instrumentation by just including the relevant gems.
- Out-of-the-box telemetry data.
- Common bases are covered.
- No application-specific data.
- Auto-instrumentation is the easiest way to get started.

Slide 34

Slide 34 text

Automatic and Manual Instrumentation ⚒
Automatic:
- Automatic instrumentation by just including the relevant gems.
- Out-of-the-box telemetry data.
- Common bases are covered.
- No application-specific data.
- Auto-instrumentation is the easiest way to get started.
Manual:
- Deeper insights into the application.
- Custom metadata.
- More control.
- Needs application code changes.
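The two approaches might look like this in a Rails app. This is a configuration sketch, not the setup shown in the talk: it assumes the Gemfile includes `opentelemetry-sdk`, `opentelemetry-exporter-otlp`, and `opentelemetry-instrumentation-all`, and the service name, tracer name, span name, and attribute key are all hypothetical.

```ruby
# config/initializers/opentelemetry.rb
# Assumes these gems are in the Gemfile: opentelemetry-sdk,
# opentelemetry-exporter-otlp, opentelemetry-instrumentation-all.
require 'opentelemetry/sdk'
require 'opentelemetry/exporter/otlp'
require 'opentelemetry/instrumentation/all'

# Automatic instrumentation: enable every installed instrumentation library.
OpenTelemetry::SDK.configure do |c|
  c.service_name = 'my-rails-app' # hypothetical service name
  c.use_all
end

# Manual instrumentation, somewhere in application code: wrap
# application-specific work in a custom span and attach custom metadata.
tracer = OpenTelemetry.tracer_provider.tracer('my-rails-app.reports') # hypothetical tracer name
tracer.in_span('generate_report', attributes: { 'report.format' => 'pdf' }) do |span|
  span.add_event('report generation started')
  # ... application code goes here ...
end
```

The first half needs no changes to application code, which is why it is the easiest starting point. The second half is where application-specific names and metadata come in, at the cost of touching the code being measured. Where the data is sent is typically controlled by exporter settings such as the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable.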

Slide 35

Slide 35 text

Way forward 🛣

Slide 36

Slide 36 text

Way forward 🛣
- Ruby Otel SDK
  - Complete support for traces
  - WIP for logs and metrics
- https://github.com/open-telemetry/opentelemetry-ruby
- https://opentelemetry.io/docs/instrumentation/ruby/

Slide 37

Slide 37 text

Who is already on board the Otel train? 🚝
- GitHub
- Shopify
- Heroku
- Puppet
- Dropbox
- You??

Slide 38

Slide 38 text

Recap 💹
- Need for standardized instrumentation
- What is the essence of OpenTelemetry?
- Otel architecture
- Rails app using Otel
- Automatic and manual instrumentation
- Way forward

Slide 39

Slide 39 text

Thanks 🤝
Prathamesh Sonpatki
Last9.io
prathamesh.tech
🐧 twitter.com/_cha1tanya
🐘 hachyderm.io/@Prathamesh
“Last9 of Reliability” Discord