Slide 1

Slide 1 text

Why does continuous profiling matter to developers?
KubeCon NA Salt Lake City 2024
Co-located event: AppDeveloperCon

Slide 2

Slide 2 text

Who are we?
- Jonas Kunz: Observability SW Engineer @ Elastic, OpenTelemetry-Java Contributor (Jonas Kunz @ CNCF-Slack)
- Mauricio Salatino: @Diagrid, @Daprdev, Application Development WG co-chair, https://salaboy.com

Slide 3

Slide 3 text

Application Development WG Survey

Slide 4

Slide 4 text

Agenda
- Building Cloud Native Resilient and Observable applications
- Pillars of Observability
- (Continuous) Profiling
- Next steps

Slide 5

Slide 5 text

Distributed applications 101

Slide 6

Slide 6 text

Cloud Native distributed applications

Slide 7

Slide 7 text

Resilient and observable distributed applications

Slide 8

Slide 8 text

- 24k GitHub Stars
- 7.7k Discord Users
- 1M Pulls/month
- 3.7k Contributors
- 14/173 CNCF Projects
- 300k Doc views/month

Slide 9

Slide 9 text

Dapr: enabling developers with APIs
- Stands for Distributed Application Runtime
- Uses a proxy to expose application-level APIs that solve common distributed challenges
- We used two APIs for these examples:
  - Service-to-service invocation
  - PubSub for async communication
- All APIs cover cross-functional concerns such as resiliency, observability and security
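
As a rough illustration of the two APIs named above, the sketch below builds the HTTP requests an application would send to its local Dapr sidecar. It assumes the sidecar listens on localhost with the default HTTP port 3500. The app ID `orders-service`, the pubsub component `pubsub`, the topic `orders` and the payload are hypothetical names, not taken from the demo. Nothing is sent over the network; the code only constructs URLs and a request object.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the Dapr sidecar runs next to the app and uses its default HTTP port.
DAPR_HTTP_PORT = 3500


def invoke_url(app_id: str, method: str, port: int = DAPR_HTTP_PORT) -> str:
    """Build the sidecar URL for service-to-service invocation."""
    return (
        f"http://localhost:{port}/v1.0/invoke/"
        f"{quote(app_id, safe='')}/method/{method.lstrip('/')}"
    )


def publish_url(pubsub_name: str, topic: str, port: int = DAPR_HTTP_PORT) -> str:
    """Build the sidecar URL for publishing a message to a topic."""
    return (
        f"http://localhost:{port}/v1.0/publish/"
        f"{quote(pubsub_name, safe='')}/{quote(topic, safe='')}"
    )


def build_publish_request(pubsub_name, topic, payload, port=DAPR_HTTP_PORT):
    """Create (but do not send) a POST request carrying a JSON event."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        publish_url(pubsub_name, topic, port),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical app ID, component name and topic, for illustration only.
    print(invoke_url("orders-service", "orders/42"))
    request = build_publish_request("pubsub", "orders", {"orderId": 42})
    print(request.get_method(), request.full_url)
```

The application only talks to localhost. Resiliency, observability and security are applied by the sidecar, which is why the application code stays this small.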

Slide 10

Slide 10 text

With Dapr

Slide 11

Slide 11 text

Demo #1: Let’s use our apps!

Slide 12

Slide 12 text

The 3 Pillars of Observability
- Metrics, Traces, Logs (since 2019)

Slide 13

Slide 13 text

The 4 Pillars of Observability
- Metrics, Traces, Logs (since 2019)
- Profiles (since 2022)

Slide 14

Slide 14 text

Profiling
- Measuring where and how an application spends its time without having to modify/instrument it
- “Time” can be many things: CPU time, wall-clock time, IO time, …
- Profiling sees the world from the OS perspective (threads, processes, OS resources)

Diagram: In-Process Profiling
- App Process containing the Profiler
- E.g. Linux Perf, Java Flight Recorder

Diagram: eBPF Profiling
- App Process on top of the OS-Kernel, with the eBPF-Profiler in the kernel
- E.g. opentelemetry-ebpf-profiler
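
To make the idea of sampling concrete, here is a minimal in-process sketch in Python. It is not how Linux Perf, Java Flight Recorder or the eBPF profiler work internally. It only shows the common principle: periodically capture the stack of a running thread and count how often each stack appears, without touching the profiled code. It relies on `sys._current_frames()`, which is CPython-specific, and `busy_work` is a hypothetical workload invented for the example.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, duration_s=0.5, interval_s=0.005):
    """Periodically capture the target thread's stack and count each distinct stack."""
    counts = collections.Counter()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)
            frame = frame.f_back
        if stack:
            # Root-first, semicolon-joined: the "folded stack" form used by flame graph tools.
            counts[";".join(reversed(stack))] += 1
        time.sleep(interval_s)
    return counts


def busy_work(stop_event):
    """Hypothetical workload: ordinary code with no profiling hooks in it."""
    total = 0
    while not stop_event.is_set():
        total += sum(i * i for i in range(1000))
    return total


def profile_busy_work(duration_s=0.5):
    """Run the workload in a thread and sample it from the calling thread."""
    stop = threading.Event()
    worker = threading.Thread(target=busy_work, args=(stop,))
    worker.start()
    try:
        return sample_stacks(worker.ident, duration_s=duration_s)
    finally:
        stop.set()
        worker.join()


if __name__ == "__main__":
    for stack, hits in profile_busy_work().most_common(5):
        print(hits, stack)
```

The number of samples per stack approximates where wall-clock time is spent. A shorter sampling interval gives more detail at the cost of more overhead, which is the trade-off a production profiler has to tune.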

Slide 15

Slide 15 text

Continuous Profiling in Production

Whole-System Visibility
- Unlock unknown-unknowns: from the kernel through userspace into high-level code, across multi-cloud workloads.

Polyglot Visibility
- C/C++, Rust & Go (without debug symbols on host)
- PHP, Python, Java (or any JVM language), Ruby, .NET, Perl & Node.js

Extremely Low Overhead
- Continuous profiling in production with negligible overhead
- Typical case: < 1% CPU, ~250MB of RAM

History
- Optimyze.cloud launched a low-overhead, multi-runtime, zero-instrumentation profiler in 2021
- Acquired by Elastic soon after
- Donated to OpenTelemetry in 2024
- Continued development and evolution: opentelemetry-ebpf-profiler

Slide 16

Slide 16 text

Recap deployment

Slide 17

Slide 17 text

Demo #2: Profiling in production

Slide 18

Slide 18 text

What’s ahead
- Stabilization of the profiling signal (OTLP)
- Stabilization of profiling support in the OpenTelemetry Collector
- Standardization of trace-to-profile correlation
- More than CPU profiling (e.g. IO, page faults, etc.)
… and much more!

Slide 19

Slide 19 text

Before we go: App Dev WG Survey

Slide 20

Slide 20 text

Thanks!
- Jonas Kunz (Jonas Kunz @ CNCF-Slack)
- Mauricio Salatino (@diagridio, @daprdev, @salaboy)
