Slide 1

Slide 1 text

Intro to Continuous Profiling and Grafana Pyroscope
Steve Caron, Staff Solutions Engineer, Grafana Labs

Slide 2

Slide 2 text

Once upon a time... M L T

Slide 3

Slide 3 text

M was relying on Metrics

Slide 4

Slide 4 text

L was relying on Logs

Slide 5

Slide 5 text

T was relying on Traces

Slide 6

Slide 6 text

Mighty P was using Profiling
...and spewing out flame graphs
Image credit: Oliver The Mighty Pig, Penguin Publishing Group ISBN:0803728867

Slide 7

Slide 7 text

Profiling completes the story of why something went wrong and how to fix it
● Metrics: Unexpected CPU spike
● Traces: Anomalous span reveals error cluster
● Logs: Error logs pinpoint user issue
● Profiles: Code-level root cause

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

What is Profiling?
“Profiling” is a way to analyze how a program uses resources like CPU or memory at code-level granularity. It uses flamegraphs to help you pinpoint the parts of your application that consume the most resources. It is commonly used during application development and is built into popular IDEs.
Challenges:
● The overhead of conventional profilers doesn’t allow for profiling in production
● On-demand profiling is a reactive approach
● Development environments don’t accurately mimic production

Slide 10

Slide 10 text

What changed?
Profiling technology has advanced. Today’s profilers add little enough overhead that they can run in production.
This allows for “continuous profiling”, a more powerful form of profiling that profiles applications periodically, adding the dimension of time.
By understanding your system’s resource usage over time, you can locate, debug, and fix performance issues.

Slide 11

Slide 11 text

The value of Continuous Profiling
● Cost cutting
○ Getting a line-level breakdown of where resource hotspots are allows you to optimize them
● Latency reduction
○ For many businesses, performance impacts revenue
■ e-commerce, ads
■ gaming, streaming
■ HFT, fintech
■ rideshare
● Incident resolution
○ Pinpoint memory leaks to specific parts of the code
○ See root cause of CPU spikes
○ See code-level details when debugging services

Slide 12

Slide 12 text

How to gather a profile?
● Instrumenting the code base
○ Tooling and formats depend on each language ecosystem
○ Access to more detailed runtime information
○ More flexibility:
■ selectively profile and label specific sections of code
■ send profiles at different intervals
● eBPF based collection
○ No insight into runtime stack trace information for interpreted languages (better fit for compiled languages)
○ Focus on CPU profiling
○ Live profiling: doesn’t require code changes or even restarts
○ Kernel dependencies (v4.9 or more recent) and requires root access
(further read: eBPF pros/cons)

Slide 13

Slide 13 text

How to gather a profile? Let’s take a look at Go
● Standard library includes CPU, Memory, Goroutine, Mutex and Block profiles
● Provides profiles over an HTTP interface
○ Profiling data is returned in a protobuf format
● Data meant to be consumed by the pprof CLI
○ Examples:
    # Collect a CPU profile over 2 seconds
    $ pprof "http://localhost:6060/debug/pprof/profile?seconds=2"
    # Get the heap memory allocations
    $ pprof "http://localhost:6060/debug/pprof/allocs"
○ Common to use the -http parameter to view profiles using the web interface
● Find more on Profiling in Go at https://pkg.go.dev/runtime/pprof#Profile
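To see what the HTTP interface returns without the pprof CLI, the sketch below serves the `net/http/pprof` handlers from a throwaway in-process test server and downloads the heap profile. The `fetchProfile`, `fetchFromLocalServer`, and `isGzip` helpers are hypothetical names for this example. The gzip check reflects how `runtime/pprof` writes the binary protobuf profile.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

// fetchProfile downloads one named profile from a pprof HTTP endpoint.
func fetchProfile(baseURL, name string) ([]byte, error) {
	resp, err := http.Get(baseURL + "/debug/pprof/" + name)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fetchFromLocalServer starts a temporary server exposing the pprof handlers
// and fetches the named profile from it.
func fetchFromLocalServer(name string) ([]byte, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()
	return fetchProfile(srv.URL, name)
}

// isGzip reports whether data starts with the gzip magic bytes.
func isGzip(data []byte) bool {
	return len(data) >= 2 && data[0] == 0x1f && data[1] == 0x8b
}

func main() {
	data, err := fetchFromLocalServer("heap")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Printf("heap profile: %d bytes, gzip-compressed: %v\n", len(data), isGzip(data))
}
```

The bytes returned are what the pprof CLI reads when pointed at the same URL.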

Slide 14

Slide 14 text

Instrumentation of Go code
    package main

    import (
        "log"
        "net/http"
        _ "net/http/pprof"
        "time"
    )

    func main() {
        go func() {
            log.Println(http.ListenAndServe("localhost:6060", nil))
        }()
        // spend 3 cpu cycles
        doALot()
        doLittle()
    }
    [...]

Slide 15

Slide 15 text

What is measured in a profile?
    package main

    func main() {
        // work
        doALot()
        doLittle()
    }

    func prepare() {
        // work
    }

    func doALot() {
        prepare()
        // work
    }

    func doLittle() {
        prepare()
        // work
    }

Slide 16

Slide 16 text

What is measured in a profile?
Time on CPU
Each measurement gets recorded on a stack-trace level
    package main

    func main() {
        // spend 3 cpu cycles
        doALot()
        doLittle()
    }

    func prepare() {
        // spend 5 cpu cycles
    }

    func doALot() {
        prepare()
        // spend 20 cpu cycles
    }

    func doLittle() {
        prepare()
        // spend 5 cpu cycles
    }

    Stack trace                           CPU cycles
    main()                                 3
    main() > doALot() > prepare()          5
    main() > doALot()                     20
    main() > doLittle() > prepare()        5
    main() > doLittle()                    5
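The per-stack measurements on this slide can be rolled up into the two numbers profilers usually report per function: "self" (time spent in the function's own code) and "total" (self plus everything it calls). The sketch below does that for the slide's five stack traces. The `sample` type and `aggregate` function are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is one recorded stack trace with the cycles spent in its leaf frame.
type sample struct {
	stack string // frames from root to leaf, separated by " > "
	value int
}

// samples mirrors the stack traces listed on the slide.
var samples = []sample{
	{"main()", 3},
	{"main() > doALot() > prepare()", 5},
	{"main() > doALot()", 20},
	{"main() > doLittle() > prepare()", 5},
	{"main() > doLittle()", 5},
}

// aggregate returns self and total cycles per function.
func aggregate(samples []sample) (self, total map[string]int) {
	self = map[string]int{}
	total = map[string]int{}
	for _, s := range samples {
		frames := strings.Split(s.stack, " > ")
		// The leaf frame is where the cycles were actually spent.
		self[frames[len(frames)-1]] += s.value
		// Every distinct function on the stack was "on CPU" through its callees.
		seen := map[string]bool{}
		for _, f := range frames {
			if !seen[f] {
				seen[f] = true
				total[f] += s.value
			}
		}
	}
	return self, total
}

func main() {
	self, total := aggregate(samples)
	names := make([]string, 0, len(total))
	for name := range total {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%-12s self=%2d total=%2d\n", name, self[name], total[name])
	}
}
```

With the slide's numbers, main() has a self of 3 and a total of 38, doALot() has 20 and 25, doLittle() has 5 and 10, and prepare() has 10 for both because it calls nothing else.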

Slide 17

Slide 17 text

Visualization of Profiles (try it yourself: flamegraph.com)
Flamegraph
● The whole width represents the total resources used (over the whole measurement duration)
● Ability to spot higher usage nodes
● Colours are grouped based on package
    package main

    func main() {
        // spend 3 cpu cycles
        doALot()
        doLittle()
    }

    func prepare(x) {
        // spend 5 cpu cycles
    }

    func doALot(65) {
        prepare(65)
        // spend 20 cpu cycles
    }

    func doLittle(26) {
        prepare(26)
        // spend 5 cpu cycles
    }
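To make the "width" idea concrete, the sketch below computes the width of every flame graph node from a set of stack samples: a node is a call path from the root, and its width is the sum of all samples that pass through that path. It uses the cycle counts from the "What is measured in a profile?" slide. The `stackSample` type, the `nodeWidths` function, and the `;` path separator are assumptions made for the example, and the text bars are only a rough stand-in for a real renderer.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// stackSample is one stack (root first) and the cycles spent in its leaf.
type stackSample struct {
	frames []string
	value  int
}

// exampleSamples reuses the cycle counts from the earlier slide.
var exampleSamples = []stackSample{
	{[]string{"main"}, 3},
	{[]string{"main", "doALot", "prepare"}, 5},
	{[]string{"main", "doALot"}, 20},
	{[]string{"main", "doLittle", "prepare"}, 5},
	{[]string{"main", "doLittle"}, 5},
}

// nodeWidths maps each call path (frames joined by ";") to its width.
func nodeWidths(samples []stackSample) map[string]int {
	widths := map[string]int{}
	for _, s := range samples {
		// A sample contributes to every prefix of its stack.
		for i := range s.frames {
			path := strings.Join(s.frames[:i+1], ";")
			widths[path] += s.value
		}
	}
	return widths
}

func main() {
	widths := nodeWidths(exampleSamples)
	paths := make([]string, 0, len(widths))
	for p := range widths {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	root := widths["main"] // the root spans the full width
	for _, p := range paths {
		w := widths[p]
		fmt.Printf("%-22s %3d%% %s\n", p, w*100/root, strings.Repeat("#", w))
	}
}
```

Note that prepare appears as two separate nodes, one under doALot and one under doLittle, which is how a flame graph keeps the calling context visible.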

Slide 18

Slide 18 text

What does “continuous profiling” look like?
● Resource usage over time
● Query
● Flamegraph & table

Slide 19

Slide 19 text

What does “continuous profiling” look like?

Slide 20

Slide 20 text

What does “continuous profiling” look like?

Slide 21

Slide 21 text

What does “continuous profiling” look like?

Slide 22

Slide 22 text

What does “continuous profiling” look like?

Slide 23

Slide 23 text

What does “continuous profiling” look like?

Slide 24

Slide 24 text

Grafana Pyroscope

Slide 25

Slide 25 text

2023: Pyroscope joined Grafana Labs

Slide 26

Slide 26 text

How does Pyroscope work?

Slide 27

Slide 27 text

Pyroscope architecture

Slide 28

Slide 28 text

What is our product today?
● Open Source Project
○ An open source continuous profiling platform
○ ~10,000 combined GitHub ⭐
● Commercial Managed Offering
○ Grafana Cloud Profiles, available in Grafana Cloud (available with free tier)
○ Fully managed Grafana and observability solution

Slide 29

Slide 29 text

“Rideshare Company” demo app

Slide 30

Slide 30 text

Demo

Slide 31

Slide 31 text

Pyroscope resources: client documentation ● Client documentation - how to send profiles to Grafana

Slide 32

Slide 32 text

More resources
● Examples in grafana/pyroscope
● #pyroscope on https://grafana.slack.com/
● 📖 https://grafana.com/docs/pyroscope/latest/
● https://play.grafana.org/a/grafana-pyroscope-app

Slide 33

Slide 33 text

Thank you! Questions?