Slide 1

Slide 1 text

@osyoyu | 2024/4/25 RubyKaigi 2024 pre-event study session | This year's RubyKaigi is Profiler Year 🤘

Slide 2

Slide 2 text

p @osyoyu
• osyoyu (pronounced "oshoyu")
• I build the Visa prepaid card "B/43" at SmartBank, Inc.

Slide 3

Slide 3 text

pp @osyoyu
First live show of "Unaban!"
This weekend, Sat 4/27, evening @ Ookayama
Every band member is a Rubyist and is attending RubyKaigi
For details, search for "unasuke band" (unasuke = one of the band members)

Slide 4

Slide 4 text

This year's RubyKaigi is Profiler Year 🤘

Slide 5

Slide 5 text

Profilers: Stackprof, ruby-prof, rbspy, ...
• Also known as "performance analysis tools"
• They let you find the "slow" parts of a program
• Many of you have probably relied on one at ISUCON or similar contests

Slide 6

Slide 6 text

Pf2: I'm building the strongest, handiest profiler
• I'm building a new profiler!
• At RubyKaigi I will talk about how Pf2 works and what makes building it hard
• For how to use Pf2, see my slides from Gotanda.rb #58 (4/23)
github.com/osyoyu/pf2

Slide 7

Slide 7 text

Can you tell where this code is slow?
Processing a 1,000,000,000-line text file
It reads one billion lines of "city name;temperature", such as "Roppongi;24.0"
For the details, search for The One Billion Row Challenge
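The code shown on this slide is not part of the transcript. As a stand-in, here is a minimal sketch of the kind of Ruby program the challenge calls for, assuming the "city name;temperature" line format quoted above. The helper name `aggregate` and the sample cities are made up for illustration.

```ruby
require "stringio"

# Hypothetical helper (not from the slides): aggregate "city;temperature"
# lines into per-city min / max / sum / count.
def aggregate(io)
  stats = Hash.new do |h, city|
    h[city] = { min: Float::INFINITY, max: -Float::INFINITY, sum: 0.0, count: 0 }
  end

  io.each_line do |line|
    city, temp = line.chomp.split(";", 2)
    next if temp.nil? # skip blank or malformed lines

    t = temp.to_f
    s = stats[city]
    s[:min] = t if t < s[:min]
    s[:max] = t if t > s[:max]
    s[:sum] += t
    s[:count] += 1
  end
  stats
end

# Tiny in-memory input standing in for the billion-line file.
sample = StringIO.new("Roppongi;24.0\nGotanda;18.5\nRoppongi;26.0\n")
result = aggregate(sample)

result.each do |city, s|
  mean = s[:sum] / s[:count]
  puts format("%s min=%.1f mean=%.1f max=%.1f", city, s[:min], mean, s[:max])
end
```

Whether `each_line`, `split`, `to_f`, or the hash updates dominate the run time is hard to guess by reading the code, which is the question a profiler answers.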

Slide 8

Slide 8 text

Let's look at a flamegraph
The wider a method's bar, the longer that method took to run

Slide 9

Slide 9 text

Let's look at a flamegraph
The wider a method's bar, the longer that method took to run 👮
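The "wider means longer" reading follows from how a flamegraph is built out of stack samples. The sketch below uses made-up sample data and method names (none of them come from the slides) and only illustrates the counting step, not any particular profiler's output format.

```ruby
# Made-up stack samples, outermost frame first. A real profiler would
# collect these; each entry stands for one sampling tick.
samples = [
  %w[main process_file String#split],
  %w[main process_file String#split],
  %w[main process_file Hash#[]],
  %w[main process_file String#to_f],
  %w[main print_result],
]

# The width of a frame is the number of samples in which that frame
# appears at that position in the stack.
widths = Hash.new(0)
samples.each do |stack|
  stack.each_index do |depth|
    widths[stack[0..depth].join(";")] += 1
  end
end

# Print a crude text flamegraph: indentation = depth, bar length = width.
widths.sort.each do |path, count|
  frame = path.split(";").last
  indent = "  " * path.count(";")
  bar = "#" * count
  puts "#{indent}#{frame} #{bar} (#{count}/#{samples.size})"
end
```

With a fixed sampling interval, a frame seen in twice as many samples was on the stack for roughly twice as long, so its bar is drawn twice as wide.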

Slide 10

Slide 10 text

With Pf2 you can expose Ruby's true form
Ruby is written in C
rb_hash_aref = Hash#[]

Slide 11

Slide 11 text

How Pf2 works
You're all curious, right?

Sampling

Pf2 is a sampling profiler. This means that Pf2 collects samples of program execution periodically, instead of tracing every action (e.g. method invocations and returns).

Pf2 uses the rb_profile_thread_frames() API for sampling. When to do so is controlled by Schedulers, described in the following section.

Schedulers

Schedulers determine when to execute sample collection, based on configuration (time mode and interval). Pf2 has two schedulers available.

SignalScheduler (Linux-only)

The first is the SignalScheduler, based on POSIX timers. Pf2 will use this scheduler when possible. SignalScheduler creates a POSIX timer for each Ruby Thread (the underlying pthread, to be more accurate) using timer_create(3). This leaves the actual time-keeping to the OS, which is capable of tracking accurate per-thread CPU time usage.

When the specified interval has elapsed (the timer has expired), the OS delivers a SIGALRM (note: unlike setitimer(2), timer_create(3) allows us to choose which signal is delivered, and Pf2 uses SIGALRM regardless of time mode). This is why the scheduler is named SignalScheduler.

Signals are directed to the Ruby Thread's underlying pthread, effectively "pausing" the Thread's activity. This routing is done using SIGEV_THREAD_ID, which is a Linux-only feature. Sample collection is done in the signal handler, which is expected to be more accurate, capturing the paused Thread's activity.

This scheduler heavily relies on Ruby's 1:1 Thread model (one Ruby Thread is strongly tied to one native pthread). It will not work properly in MaNy (RUBY_MN_THREADS=1).

TimerThreadScheduler

Another scheduler is the TimerThreadScheduler, which maintains a time-keeping thread by itself. A new native thread (pthread on Linux/macOS) is created, and an infinite loop runs inside it. After sleep(2)-ing for the specified interval, sampling is queued using Ruby's Postponed Job API.

This scheduler is wall-time only, and does not support CPU-time based profiling.
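To make the timer-thread idea concrete, here is a pure-Ruby sketch of a wall-time sampler: a separate thread sleeps for the interval, then records the target thread's backtrace. It is an illustration only and not Pf2's implementation, which is a native extension built on rb_profile_thread_frames() and the Postponed Job API. `TinySampler`, `top_frames`, and `busy_work` are hypothetical names.

```ruby
# Hypothetical, pure-Ruby illustration of timer-thread sampling.
class TinySampler
  attr_reader :samples

  def initialize(target: Thread.current, interval: 0.01)
    @target = target
    @interval = interval
    @samples = []
    @stop = false
  end

  def start
    @timer = Thread.new do
      until @stop
        sleep @interval             # time-keeping happens in this thread
        break if @stop
        stack = @target.backtrace   # snapshot of the target thread's stack
        @samples << stack if stack  # backtrace is nil for a dead thread
      end
    end
    self
  end

  def stop
    @stop = true
    @timer.join
    self
  end

  # Count samples by innermost frame (roughly "self time").
  def top_frames
    @samples.filter_map(&:first).tally.sort_by { |_, n| -n }
  end
end

def busy_work
  50_000.times.sum { |i| Math.sqrt(i) }
end

sampler = TinySampler.new(interval: 0.01).start
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.5
busy_work while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
sampler.stop

puts "collected #{sampler.samples.size} samples"
sampler.top_frames.first(3).each { |frame, n| puts "#{n}\t#{frame}" }
```

Like the TimerThreadScheduler described above, this measures wall time only. In CRuby the sampling thread can only run when it obtains the interpreter lock, so the effective interval is likely to be longer than the requested one.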

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Day 1, 14:50-. Please come!!!

Slide 14

Slide 14 text

Today

Slide 15

Slide 15 text

This year's RubyKaigi is Profiler Year ✌

Slide 16

Slide 16 text

2022, 2023: parsers
2024: profilers

Slide 17

Slide 17 text

❌ How to use a profiler (I've written that up on the internet)
❌ Pf2 in detail (please come and hear it at RubyKaigi)

Slide 18

Slide 18 text

⭕ Just how much this year is Profiler Year

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

70

Slide 21

Slide 21 text

70
The number of RubyKaigi 2024 attendees who are building a profiler

Slide 22

Slide 22 text

Let me explain the evidence
RubyKaigi 2024 by the numbers
• Attendees: 1400
  • (the attendance figure for RubyKaigi 2023)
• Sessions: 52
  • Of which, profiler-related sessions: ...

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Three profiler talks (5%)
• The depths of profiling Ruby
• Vernier: A next generation profiler for Ruby
• Optimizing Ruby: Building an Always-On Production Profiler

Slide 27

Slide 27 text

1400 × (3/52) =
1400: RubyKaigi attendees; 3/52: profiler talk rate

Slide 28

Slide 28 text

1400 × (3/52) ≈ 1400 × 5% = 70
1400: RubyKaigi attendees; 3/52: profiler talk rate (rounded down to 5%)
70: the number of attendees building a profiler

Slide 29

Slide 29 text

The complete guide to the profiler talks
• The depths of profiling Ruby (osyoyu)
  • Introduces the new profiler Pf2
• Vernier: A next generation profiler for Ruby (jhawthorn)
  • Introduces the new profiler Vernier
• Optimizing Ruby: Building an Always-On Production Profiler (ivoanjo)
  • Introduces the profiler built into Datadog

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Why are profilers catching on this much?

Slide 34

Slide 34 text

Demand and supply have matched, and the result is a frenzy

Slide 35

Slide 35 text

Unpacking demand and supply
And unpacking things means looking at history
Demand = what we want profilers to be able to do
Supply = what profilers can do

Slide 36

Slide 36 text

2009

Slide 37

Slide 37 text

2009: perf_events (perf) arrives in Linux 2.6
Enables powerful profiling that also integrates the CPU's performance counters
Still going strong today! Also handy when profiling ruby/ruby

Slide 38

Slide 38 text

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6

2013: The TracePoint API arrives (Ruby)
An API for capturing "events" during Ruby execution (method calls, returns, ...)
ruby-prof is built on it
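As a small illustration of the event-based approach, the sketch below uses Ruby's TracePoint API to count calls to Ruby-defined methods. The methods `parse` and `process` are made up for the example, and this shows only the API itself, not how ruby-prof uses it internally.

```ruby
# Two throwaway methods to trace (hypothetical, for illustration only).
def parse(line)
  line.split(";")
end

def process(lines)
  lines.map { |l| parse(l) }
end

# :call fires whenever a method defined in Ruby is invoked.
calls = Hash.new(0)
trace = TracePoint.new(:call) do |tp|
  calls[tp.method_id] += 1
end

# Tracing is active only while the block runs.
trace.enable { process(["Roppongi;24.0", "Gotanda;18.5"]) }

calls.each { |name, n| puts "#{name}: #{n}" }
```

Because the hook runs on every traced event, this style records every call but adds overhead to each one. Sampling profilers avoid that cost by looking at the stack only periodically.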

Slide 39

Slide 39 text

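2013: Brendan Gregg's sysperf book (Systems Performance) is published
A classic: the Dragon Book of the performance world
Buy and read the heavily updated second edition

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives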

Slide 40

Slide 40 text

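2013: DTrace support lands in Ruby
A handy tool: by embedding probes in a program, you can collect things like how many times a given line was executed

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published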

Slide 41

Slide 41 text

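2014: The rb_profile_frames() API arrives & Stackprof is released
A C API that returns "the methods currently being executed" was added in Ruby 2.1.0, and Stackprof, a profiler built on it, appeared

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby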

Slide 42

Slide 42 text

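2015: eBPF becomes practical
Around Linux 4.1, eBPF's feature set fills out considerably
Building a profiler that runs in kernel land becomes realistic
Running in kernel land = low overhead. Super hot!

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released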

Slide 43

Slide 43 text

2016-2018?: The rise of Kubernetes and momentum toward out-of-process profiling
The move to containers is unstoppable; somewhat later, demand also grows for profilers that can run as a kubectl plugin

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical

Slide 44

Slide 44 text

2018: rbspy is released
A profiler that uses Linux's process_vm_readv(2) API
A new idea: it works by peeking at memory from outside the ruby process

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2018?: The rise of Kubernetes and momentum toward out-of-process profiling

Slide 45

Slide 45 text

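2020-2022?: The concept of Continuous Profiling emerges
From "plugging in a profiler as part of an investigation" to an era of "profiling all the time"
Datadog also releases a Continuous Profiler feature
Also draws attention in connection with Observability (o11y)

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2019?: The rise of Kubernetes and momentum toward out-of-process profiling
2018: rbspy is released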

Slide 46

Slide 46 text

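2021-2023?: The concept of Observability takes off
Commonly known as o11y
It's, you know... the "let's make it possible to properly understand a complex system as a whole!" kind of thing

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2019?: The rise of Kubernetes and momentum toward out-of-process profiling
2018: rbspy is released
2020-2022?: The concept of Continuous Profiling emerges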

Slide 47

Slide 47 text

2023: The rb_profile_thread_frames() API arrives
An extension of the rb_profile_frames() API. Being able to specify which Ruby Thread to take the stack from enables finer-grained profiling

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2019?: The rise of Kubernetes and momentum toward out-of-process profiling
2018: rbspy is released
2020-2022?: The concept of Continuous Profiling emerges
2021-2023?: The concept of Observability takes off

Slide 48

Slide 48 text

2023: Pf2 and Vernier are released
Probably both in 2023?
Profilers implementing features that never existed before arrive all at once

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2019?: The rise of Kubernetes and momentum toward out-of-process profiling
2018: rbspy is released
2020-2022?: The concept of Continuous Profiling emerges
2021-2023?: The concept of Observability takes off
2023: The rb_profile_thread_frames() API arrives

Slide 49

Slide 49 text

2023: Pf2 and Vernier are released
Probably both in 2023?
Profilers implementing features that never existed before arrive all at once

Visualizers have also been improving rapidly
Speedscope, Firefox Profiler, Perfetto, Chrome Profiler, ...

Timeline so far:
2009: perf_events (perf) arrives in Linux 2.6
2013: The TracePoint API arrives
2013: Brendan Gregg's sysperf book is published
2013: DTrace support lands in Ruby
2014: The rb_profile_frames() API arrives & Stackprof is released
2015: eBPF becomes practical
2016-2019?: The rise of Kubernetes and momentum toward out-of-process profiling
2018: rbspy is released
2020-2022?: The concept of Continuous Profiling emerges
2021-2023?: The concept of Observability takes off
2023: The rb_profile_thread_frames() API arrives

Slide 50

Slide 50 text

In 2024, profilers are hot
• "Demand" is rising, led by Continuous Profiling
• "Supply" is rising too, through API additions to Linux and Ruby
Don't miss the wave!!!!

Slide 51

Slide 51 text

This year's RubyKaigi is Profiler Year ✌

Slide 52

Slide 52 text

Day 1, 14:50-. Please come!!!