Slide 1

Slide 1 text

osyoyu | RubyKaigi 2025 follow up
I want to build a slightly better Ruby profiler (2025)

Slide 2

Slide 2 text

pp @osyoyu • Daisuke Aritomo / 有友 大輔 • https://github.com/osyoyu

Slide 3

Slide 3 text

Source material for today's talk
• Profile and benchmark every change (RubyKaigi 2025)
• The depths of profiling Ruby (RubyKaigi 2024)

Slide 4

Slide 4 text

Still building a Ruby profiler
• Profiler: a tool for finding the slow parts of a program

Slide 5

Slide 5 text

github.com/osyoyu/pf2
• Aims to measure Ruby "down to every corner" and "accurately"
• Ruby frames
• C (native) frames
• GVL ownership status
• Designed as a sampling, in-process profiler
• Also comes with good visualization
(Slide from 2024)

Slide 6

Slide 6 text

Visualization
• Tracks showing what each thread is doing
• Call tree
• Flamegraph
(Slide from 2024)

Slide 7

Slide 7 text

Optcarrot with C frames
(Slide from 2024)

Slide 8

Slide 8 text

Recent troubles

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

What is the profiler's animal?

Slide 12

Slide 12 text

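Trouble 1: It consumes a lot of memory
(No big deal if you have plenty of memory)
• Memory gets eaten up at around 20 MB/s
• Brutal
• Avoid recording the same function's information more than once
• Avoid recording the same stack's information more than once
• Building proper tables solves this in the ordinary way
• Don't cut corners in the implementation
• Also tried implementing a feature that writes out to a file when memory is about to overflow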
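The deduplication described on this slide can be sketched as two interning tables: one row per distinct function, and one row per distinct (parent stack, function) pair, so that each sample only needs to store a single stack index. This is a minimal sketch under assumptions, not pf2's actual code: the names `func_table`, `stack_table`, `intern_func`, `intern_stack`, and `intern_frames` are hypothetical, the capacities are arbitrary, and lookups use a linear scan for brevity where a real profiler would likely use a hash table.

```c
#include <string.h>

#define MAX_FUNCS  1024   /* arbitrary capacity for this sketch */
#define MAX_STACKS 4096   /* arbitrary capacity for this sketch */
#define NO_PARENT  (-1)   /* parent of a root frame */

/* One row per distinct function; samples refer to it by index.
 * Only the pointer is stored, so the caller must keep the name alive. */
struct func_table {
    const char *names[MAX_FUNCS];
    int count;
};

/* One row per distinct (parent stack, function) pair. A whole call
 * stack is identified by the index of its leaf row. */
struct stack_table {
    int parent[MAX_STACKS];
    int func[MAX_STACKS];
    int count;
};

/* Return the index of `name`, adding a row only the first time it is seen.
 * Returns -1 when the table is full. */
int intern_func(struct func_table *t, const char *name) {
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0) return i;
    if (t->count == MAX_FUNCS) return -1;
    t->names[t->count] = name;
    return t->count++;
}

/* Return the index of the (parent, func) pair, adding it if missing.
 * Returns -1 when the table is full. */
int intern_stack(struct stack_table *t, int parent, int func) {
    for (int i = 0; i < t->count; i++)
        if (t->parent[i] == parent && t->func[i] == func) return i;
    if (t->count == MAX_STACKS) return -1;
    t->parent[t->count] = parent;
    t->func[t->count] = func;
    return t->count++;
}

/* Intern a whole stack given root-first frame names.
 * Returns the leaf stack index, or -1 if either table is full. */
int intern_frames(struct func_table *ft, struct stack_table *st,
                  const char *const *frames, int depth) {
    int stack = NO_PARENT;
    for (int i = 0; i < depth; i++) {
        int f = intern_func(ft, frames[i]);
        if (f < 0) return -1;
        stack = intern_stack(st, stack, f);
        if (stack < 0) return -1;
    }
    return stack;
}
```

With this layout, two samples taken in the same call stack cost one integer each, and stacks that share a prefix share the rows for that prefix.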

Slide 13

Slide 13 text

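Trouble 2: Profiling causes deadlocks
(This one is the worst)
• It just hangs
• Enabling the feature that "adds newly created threads to the profiling targets" makes something deadlock at random
• The worst
• I have become quite fast at typing ^Z kill -9 %1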

Slide 14

Slide 14 text

Assorted unstable implementation details
• For various reasons, it carries a copy-pasted definition of rb_thread_t's layout
• Naturally, it breaks whenever rb_thread_t's layout changes
• Many things run in parallel, so I was sprinkling Mutexes around carelessly without really understanding them
• And got exactly what I deserved

#[repr(C)]
struct rb_native_thread {
    _padding_serial: [c_char; 4], // rb_atomic_t
    _padding_vm: *mut c_int, // struct rb_vm_struct
    thread_id: rb_nativethread_id_t,
    // ...
}

#[repr(C)]
struct rb_thread_struct {
    _padding_lt_node: [c_char; 16], // struct ccan_list_node
    _padding_self: VALUE,
    _padding_ractor: *mut c_int, // rb_ractor_t
    _padding_vm: *mut c_int, // rb_vm_t
    nt: *mut rb_native_thread,
    // ...

Slide 15

Slide 15 text

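Rewrite, and while at it, from Rust to C
• This is the second (third?) implementation, so it is clearly more refined than before
• Dropped most of the Mutexes; lock-free structures only where needed
• Also managed to stop using Vec-like containers
• Wrote a (probably) fast ring buffer
• Other quietly pleasant things:
• The macros provided by ruby.h can be used!
• Distribution is easy!
• Fewer mistakes get caught at compile time, but almost everything was unsafe anyway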
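As an illustration of the kind of lock-free ring buffer mentioned on this slide, here is a minimal sketch. It assumes exactly one producer (for example, a sampling signal handler) and one consumer (a collector thread); the slide does not say which variant pf2 uses, and the names `ring`, `sample`, `ring_push`, and `ring_pop` as well as the capacity are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY 1024
_Static_assert((RING_CAPACITY & (RING_CAPACITY - 1)) == 0,
               "capacity must be a power of two so masking works");

/* Hypothetical sample record for this sketch. */
struct sample {
    uint64_t timestamp_ns;
    int stack_id;
};

struct ring {
    struct sample slots[RING_CAPACITY];
    _Atomic size_t head;  /* total pushes so far; stored only by the producer */
    _Atomic size_t tail;  /* total pops so far; stored only by the consumer */
};

/* Producer side. Returns false (dropping the sample) when the ring is full. */
bool ring_push(struct ring *r, struct sample s) {
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_CAPACITY) return false;
    r->slots[head & (RING_CAPACITY - 1)] = s;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side. Returns false when the ring is empty. */
bool ring_pop(struct ring *r, struct sample *out) {
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) return false;
    *out = r->slots[tail & (RING_CAPACITY - 1)];
    /* Release: the slot is handed back to the producer after it was read. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Neither side takes a lock or allocates, which is the property that matters when the producer runs inside a signal handler. Dropping samples when full is a design choice of this sketch, not something the slide states.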

Slide 16

Slide 16 text

Trouble 3: No good profiling targets at hand
(I'm tired of profiling Mandelbrot and Rails)

Slide 17

Slide 17 text

Trouble 3: No good profiling targets at hand
(I'm tired of profiling Mandelbrot and Rails)
• Solved: I heard about two interesting-looking targets today

Slide 18

Slide 18 text

Trouble 4: Nobody uses it
(Yeah, I know)
• As mentioned earlier, many features are not stable...
• And I haven't really promoted it...

Slide 19

Slide 19 text

So, it is about time to ship a new version
• The full rewrite made things considerably more stable, so I'd like to cut a release
• There are lots of fixes that exist only on master

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

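Recent news
I met Brendan Gregg
• An authority on performance analysis
• The person who invented Flame Graphs
• Has developed a large number of other techniques as well
• Gave me all kinds of advice about profilers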

Slide 23

Slide 23 text

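Words from Brendan Gregg
• Even when nothing is wrong, it is good to take a profile of about one minute once a day
• When something does happen, it helps you investigate when it started
• I see!
• This feels more like a feature an APM would provide, but
• a tool that lets you do it casually might be nice to have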

Slide 24

Slide 24 text

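"Kernel frames should be visible too"
• Pf2 can display Ruby frames and C frames merged together
• The advice was that kernel frames (what lies beyond the syscall) should show up there as well
• I don't quite see the benefit yet
• It seems impossible to capture without eBPF, so I haven't started on it
(Slide from 2024)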

Slide 25

Slide 25 text

Flame Scope
• A recent invention
• Implemented it!

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Guess I'll do it...

Slide 28

Slide 28 text

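Sorting out likely challenges in the profiler space
• Low-overhead wall-time profiling
• Ideally you would record only the moment I/O starts and the moment it ends, then compute the rest afterward, but most Ruby profilers today "periodically ask every thread what it is doing right now"
• Knowing whether a function has been JIT-compiled
• YJIT can emit a perf map of its own, but methods that have not been JIT-compiled are not visible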

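The "record only the start and end of I/O, then compute afterward" idea from the wall-time bullet on slide 28 can be sketched as follows. This is only an illustration under assumptions: the names `io_span`, `span_journal`, `span_begin`, `span_end`, and `total_wall_ns` are hypothetical, and how a profiler would actually hook the start and end of blocking operations in Ruby is not covered by the slides.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define MAX_SPANS 256  /* arbitrary capacity for this sketch */

/* One blocking operation, described only by its two endpoints. */
struct io_span {
    uint64_t begin_ns;
    uint64_t end_ns;   /* 0 while the operation is still in flight */
};

struct span_journal {
    struct io_span spans[MAX_SPANS];
    int count;
};

/* Current monotonic time in nanoseconds. */
uint64_t monotonic_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Call right before the blocking call. Returns a slot index, or -1 if full. */
int span_begin(struct span_journal *j) {
    if (j->count == MAX_SPANS) return -1;
    j->spans[j->count].begin_ns = monotonic_ns();
    j->spans[j->count].end_ns = 0;
    return j->count++;
}

/* Call right after the blocking call returns. */
void span_end(struct span_journal *j, int slot) {
    if (slot >= 0) j->spans[slot].end_ns = monotonic_ns();
}

/* Computed after the fact: sum the wall time of all finished spans.
 * Spans that never ended are skipped. */
uint64_t total_wall_ns(const struct span_journal *j) {
    uint64_t total = 0;
    for (int i = 0; i < j->count; i++) {
        const struct io_span *s = &j->spans[i];
        if (s->end_ns != 0 && s->end_ns >= s->begin_ns)
            total += s->end_ns - s->begin_ns;
    }
    return total;
}
```

The cost here is two clock reads per blocking operation, independent of how long the operation takes, whereas periodic polling pays on every tick for every thread.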
Slide 29

Slide 29 text

Sorting out likely challenges in the profiler space
• LLM-readable flamegraph
• Feeding a flamegraph to a VLM as an image gets nothing useful read out of it

Slide 30

Slide 30 text

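Sorting out likely challenges in the profiler space
• perf and eBPF support
• Wanting to see Ruby frames too when taking a profile with plain perf
• Hmm... really???
• What about anything other than Linux?
• Aren't you all developing on Macs?
• perf doesn't work properly on WSL or Docker (Mac); are you OK with that?
• So I'm not very interested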

Slide 31

Slide 31 text

The great Ractor era

Slide 32

Slide 32 text

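Big challenge: Ractor support
• rb_profile_frames() tells you nothing at all about Ractors
• For profilers that run outside the process and interpret the ruby process's memory, learning about Ractors also looks quite hard
• Going M:N makes it pretty rough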

Slide 33

Slide 33 text

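Learning from Go and OpenJDK
• The profiling approaches available differ greatly depending on the language implementation
• Languages without a runtime have no choice but to place the profiler outside
• Languages with a fairly complex runtime, on the other hand, tend to have the profiler built in
• Go: runtime/pprof; Java: JFR, AsyncGetCallTrace()
• Go in particular has a goroutine mechanism close to Ractor, so the structure of its profiler should be a useful reference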

Slide 34

Slide 34 text

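Should the profiler just go inside CRuby?
(A new ext/profiler)
• In the end, a profiler is something that captures execution state
• If there is a runtime, isn't it reasonable for that capability to live inside the runtime?
• Running as an external process is what makes the unstable struct copy-paste necessary
• Maybe putting it into something like ext/profiler is the way to go
• In a prototype, both GET_VM() and rb_thread_t were available, which was convenient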

Slide 35

Slide 35 text

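Prototype in progress: Ractor / M:N support in ext/profiler
• Before Ractor, you could assume that "only one thread per process can use the CPU at a time"
  (Note: C extensions that release the GVL are ignored here)
• With Ractor, that assumption breaks down
• When the kernel notifies you that "the process used 10 ms of CPU time", you must be able to identify which kernel thread and which Ruby Thread was responsible
(The great Ractor era)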

Slide 36

Slide 36 text

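Prototype in progress: Ractor / M:N support in ext/profiler
• On Linux, timer_create(CLOCK_THREAD_CPUTIME_ID) lets you track CPU time consumed per kernel thread
• However, the mapping between kernel threads and Ruby Threads is not exposed through the C API
[Diagram: Kernel Thread 1 with one Ruby Thread; KT 2 with three RTs]
• As a profiler, you want to know which RT KT 2 was running at a given moment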
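For reference, a minimal sketch of arming a per-thread CPU-time timer with timer_create(CLOCK_THREAD_CPUTIME_ID) on Linux. The helper names `start_thread_cpu_timer`, `burn_cpu_ms`, and `tick_count` are hypothetical, and the handler only counts ticks where a real profiler would capture a stack. The sketch uses SIGEV_SIGNAL, which is enough in a single-threaded program; a multi-threaded profiler would presumably need the Linux-specific SIGEV_THREAD_ID so that each signal is delivered to the thread whose clock expired. On older glibc versions, linking with -lrt may be required.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Incremented by the handler; a real profiler would record a sample here. */
volatile sig_atomic_t tick_count = 0;

static void on_tick(int signo) {
    (void)signo;
    tick_count++;
}

/* Arm a timer that raises SIGPROF each time the CALLING thread has consumed
 * another `interval_ms` of CPU time. Returns 0 on success, -1 on failure. */
int start_thread_cpu_timer(timer_t *timer, long interval_ms) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0) return -1;

    struct sigevent sev;
    memset(&sev, 0, sizeof sev);
    sev.sigev_notify = SIGEV_SIGNAL;
    sev.sigev_signo = SIGPROF;
    if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sev, timer) != 0) return -1;

    struct itimerspec its;
    its.it_interval.tv_sec = interval_ms / 1000;
    its.it_interval.tv_nsec = (interval_ms % 1000) * 1000000L;
    its.it_value = its.it_interval;  /* first expiry after one interval */
    return timer_settime(*timer, 0, &its, NULL);
}

/* Busy-loop until the calling thread has used at least `ms` of CPU time. */
void burn_cpu_ms(long ms) {
    struct timespec start, now;
    long elapsed_ms;
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start);
    do {
        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
        elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L
                   + (now.tv_nsec - start.tv_nsec) / 1000000L;
    } while (elapsed_ms < ms);
}
```

Because the clock counts CPU time rather than wall time, a thread that is sleeping or blocked on I/O produces no ticks. The timer itself does not say which Ruby Thread was running, which is the gap this slide describes.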

Slide 37

Slide 37 text

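It might end up looking like this?
• Set up a kernel timer for each native thread
• Do this when the profiler starts, and again whenever a new thread is started afterward
• Record thread movement between Ractors in thread_sched_switch
• ...or so I thought, but if calling GET_RACTOR() inside a signal handler is allowed, that would work too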

Slide 38

Slide 38 text

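So, to wrap up
• I'm moving forward with safe, incremental improvements
• Separately, looking toward Ruby 3.5, I'm trying to find out whether placing a profiler inside CRuby is a realistic approach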