The path to memory reduction in RBS
Money Forward Tech LT大会 vol.2 at Fukuoka
Oct. 15th 2024
Slide 2
Slide 2 text
pp self
● Pocke
● Work for Money Forward
● Ruby committer (RBS maintainer)
● Rails application developer
● From Okayama
○ My favorite ramen in Okayama🍜 →→
Slide 3
Slide 3 text
Agenda
The main theme is reducing the memory usage of RBS and Steep.
● Why do I need to reduce the memory usage of RBS?
● Memory Profiling for Ruby
● Future plan
Slide 4
Slide 4 text
Glossary
● RBS
○ A library for static typing of Ruby
○ It provides RBS language, tools, and so on
● Steep
○ A static type checker for Ruby
○ It uses RBS
○ It provides CLI tools and LSP server
Slide 5
Slide 5 text
Why do I need to reduce the memory usage of RBS?
Slide 6
Slide 6 text
Why is the memory improvement necessary?
Steep uses too much memory because:
● Steep runs as resident (long-running) processes because it works as an LSP server
● Steep launches many processes
○ One set of processes per project using Steep
○ One worker per CPU core, because Steep launches workers for parallelization
○ total_memory = projects.size * CPUs.size * memory_per_process
Slide 7
Slide 7 text
Why is the memory improvement necessary?
One Steep worker process consumes ~1.5 GB of memory in a mid-sized Rails application
e.g. 8 cores * 5 projects * 1.5 GB/proc = 60 GB
We need to decrease the memory usage for Steep to be widely used.
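The estimate above can be reproduced with a few lines of Ruby. This is only a sketch of the slide's arithmetic; the method name and keyword arguments are made up for illustration, and the inputs are the example figures from the slide.

```ruby
# Hypothetical helper: multiplies out the slide's formula
# total_memory = projects * CPUs * memory_per_process.
def estimated_total_gb(projects:, cpus:, gb_per_process:)
  projects * cpus * gb_per_process
end

# Example figures from the slide: 5 projects, 8 cores, ~1.5 GB per worker.
TOTAL_GB = estimated_total_gb(projects: 5, cpus: 8, gb_per_process: 1.5)
puts "#{TOTAL_GB} GB" # prints "60.0 GB"
```

Because the three factors multiply, lowering the per-process figure shrinks the total for every project and every core at once.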
Slide 8
Slide 8 text
Memory Profiling for Ruby
Slide 9
Slide 9 text
Measure. Don't guess.
Profiling is important to identify the bottleneck
Slide 10
Slide 10 text
SamSaffron/memory_profiler
Ruby has the memory_profiler gem.
require 'memory_profiler'
arr = []
r = MemoryProfiler.report do
  Object.new # (1) allocated, then unreferenced
  arr.push Object.new # (2) allocated and still referenced by arr
end
r.pretty_print
Allocated:
(1), (2)
Retained:
(2)
Slide 11
Slide 11 text
No content
Slide 12
Slide 12 text
It's a really useful gem, but…
It is not enough for Steep because:
● I want to reduce the "peak" memory usage of Steep
● memory_profiler is not well suited to profiling peak memory usage
Slide 13
Slide 13 text
Allocated Memory by memory profiler
● It traces all allocated memory/objects during profiling
● Pros: It's helpful for finding an execution-time bottleneck caused by memory allocation
● Cons: It's not helpful for finding the cause of the peak memory usage
○ Too noisy
○ Example: "I meant to improve Steep's memory usage, but ended up improving its execution speed" (in Japanese) - Money Forward Developers Blog
https://moneyforward-dev.jp/entry/2024/07/29/improve-steep-performance
Slide 14
Slide 14 text
Retained Memory by memory profiler
● It traces all retained memory/objects when the profiling is
finished
● Pros: It's helpful for finding a memory leak
● Cons: It's not helpful for finding the cause of the peak memory usage
○ We would need to stop profiling at the peak, but it is not obvious when the peak occurs
Slide 15
Slide 15 text
New memory profiler: Majo🧙
I created a new memory profiler for Ruby to profile peak memory
usage.
https://github.com/pocke/majo
Slide 16
Slide 16 text
The strategy of Majo
● I assume that peak memory usage can be approximated by the memory usage of long-lived objects
● Majo collects allocation info only for long-lived objects
○ It measures an object's lifetime by how many GC runs the object has survived
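The idea of measuring lifetime in GC runs can be tried with Ruby's standard `objspace` library. This is a rough sketch and not Majo's implementation or API: `gc_runs_survived` is a hypothetical helper, and it relies on `ObjectSpace.allocation_generation`, which reports the GC count at the time a traced object was allocated.

```ruby
require 'objspace'

# Hypothetical helper (not part of Majo): number of GC runs that have happened
# since `obj` was allocated, or nil if tracing was off when it was created.
def gc_runs_survived(obj)
  born_at = ObjectSpace.allocation_generation(obj)
  born_at && GC.count - born_at
end

ObjectSpace.trace_object_allocations_start
OLD_OBJECT = Object.new    # allocated before the GC runs below
3.times { GC.start }       # force three GC runs
YOUNG_OBJECT = Object.new  # allocated after them
OLD_AGE = gc_runs_survived(OLD_OBJECT)
YOUNG_AGE = gc_runs_survived(YOUNG_OBJECT)
ObjectSpace.trace_object_allocations_stop

puts "old object survived #{OLD_AGE} GC runs, young object #{YOUNG_AGE}"
```

A profiler following this strategy would keep allocation records only for objects whose survival count exceeds some threshold, which filters out short-lived noise.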
Slide 17
Slide 17 text
How Majo casts a spell on Ruby
● Ruby provides hooks for object allocation and `free`
● Majo uses these internal TracePoint events
○ `RUBY_INTERNAL_EVENT_NEWOBJ`
○ `RUBY_INTERNAL_EVENT_FREEOBJ`
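These internal events can only be subscribed to from a C extension, so they cannot be shown directly in Ruby. As a Ruby-level illustration, the standard `objspace` library is built on the same two events and exposes what the allocation hook records. The sketch below uses only `objspace` calls and is not Majo's code.

```ruby
require 'objspace'

# While tracing is on, objspace's NEWOBJ hook records where each object was
# allocated, and its FREEOBJ hook discards the record when the object is freed.
ObjectSpace.trace_object_allocations_start
TRACED = Object.new; TRACED_LINE = __LINE__
SITE = [
  ObjectSpace.allocation_sourcefile(TRACED),  # file that allocated the object
  ObjectSpace.allocation_sourceline(TRACED)   # line that allocated the object
]
ObjectSpace.trace_object_allocations_stop

puts SITE.join(":")
```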
Slide 18
Slide 18 text
CSV format output
Majo supports CSV output. It's really useful with a spreadsheet.
https://docs.google.com/spreadsheets/d/1TnlnLXQTnuDfB3Bhw0sNp9y2iZObqpkKVeqE--eAdlk/edit?gid=331894152#gid=331894152
Slide 19
Slide 19 text
CSV format output on a spreadsheet
Slide 20
Slide 20 text
The result by Majo
● Reduce Array allocation during parsing
○ https://github.com/ruby/rbs/pull/1950
● Reduce Hash allocation during parsing
○ "Chasing the mystery of why unnecessary processing makes execution faster" (in Japanese) - Money Forward Developers Blog
https://moneyforward-dev.jp/entry/2024/09/26/removing-steps-makes-it-slower
○ I will introduce this patch in the next Ruby version
Slide 21
Slide 21 text
Future plan
Slide 22
Slide 22 text
Future plan
I will make Steep's process management more Copy on Write (CoW) friendly.
Slide 23
Slide 23 text
What's Copy on Write
It's a technique that defers copying until a write happens.
These slides focus on the CoW that *nix applies to memory on `fork`.
Note: `fork` is an API to duplicate a process on *nix OS🍴
Slide 24
Slide 24 text
Copy on Write Example (1)
# Memory
[1, 2, 3]
# Process A
x = [1, 2, 3]
if fork
  x.push(42)
  p x
else
  p x
end
Slide 25
Slide 25 text
Copy on Write Example (2)
# Memory
[1, 2, 3]
# Process A
x = [1, 2, 3]
if fork
  x.push(42)
  p x
else
  p x
end
# Process A'
x = [1, 2, 3]
if fork
  x.push(42)
  p x
else
  p x
end
Slide 26
Slide 26 text
Copy on Write Example (3)
# Memory
[1, 2, 3, 42]
# Copying!
[1, 2, 3]
# Process A
x = [1, 2, 3]
if fork
  x.push(42)
  p x
else
  p x
end
# Process A'
x = [1, 2, 3]
if fork
  x.push(42)
  p x
else
  p x
end
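The observable effect of these three slides can be checked with one script. It is a minimal sketch assuming a *nix system (`fork` is not available on Windows). It shows only the visible result, namely that the parent's write does not reach the child; the page sharing and copying itself happens inside the OS and is not visible from Ruby. The pipe and the constant name are additions for the demonstration.

```ruby
x = [1, 2, 3]
reader, writer = IO.pipe

# Process A' (child): reports the array it sees back to the parent.
pid = fork do
  reader.close
  writer.puts x.inspect
  writer.close
end

# Process A (parent): writes to the array after forking.
writer.close
x.push(42)
CHILD_VIEW = reader.read.chomp
Process.wait(pid)

puts "parent sees #{x.inspect}"    # prints "parent sees [1, 2, 3, 42]"
puts "child saw   #{CHILD_VIEW}"   # prints "child saw   [1, 2, 3]"
```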
Slide 27
Slide 27 text
The current process management of Steep
Steep LSP uses a Master-Workers structure.
[Diagram: Steep Master forks Steep Worker 1, Steep Worker 2, and Steep Worker 3]
● Master
○ Relays messages between the LSP client and the workers
● Workers
○ Process LSP features
○ Type checking, completion, hover, …
Slide 28
Slide 28 text
The current process management of Steep
Each worker has its own RBS::Environment.
[Diagram: Steep Master forks Steep Worker 1, Steep Worker 2, and Steep Worker 3; each worker holds a separate RBS::Env (1, 2, 3)]
Slide 29
Slide 29 text
Solution: Fork-Worker and Reforking
A CoW friendly process management structure for the Master-Worker
model
● In the traditional Master-Worker model, workers are forked from
the master process
● In Fork-Worker, workers are forked from a worker process
I borrowed this idea from puma and pitchfork (HTTP servers for Ruby)
https://github.com/puma/puma/blob/master/docs/fork_worker.md
Slide 30
Slide 30 text
Fork-Worker
All workers share the same memory
[Diagram: Steep Master, Steep Worker 1, Steep Worker 2, and Steep Worker 3 connected by three fork arrows; a single RBS::Env]
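The Fork-Worker shape can be sketched in plain Ruby as follows. This is an illustration under assumptions, not Steep's implementation: `load_environment` is a hypothetical stand-in for building an RBS::Environment, the pipe exists only to collect a report, and the script requires a *nix system for `fork`.

```ruby
# Hypothetical stand-in for building an RBS::Environment (expensive in reality).
def load_environment
  Array.new(10_000) { |i| "Type#{i}" }
end

reader, writer = IO.pipe

# The master forks only worker 1, before anything heavy is loaded.
worker1 = fork do
  reader.close
  env = load_environment # loaded once, inside worker 1

  # Worker 1 forks the other workers; they inherit `env` without reloading it.
  siblings = [2, 3].map do |id|
    fork do
      writer.puts "worker #{id}: #{env.size} types inherited"
      writer.close
    end
  end
  siblings.each { |pid| Process.wait(pid) }

  writer.puts "worker 1: #{env.size} types loaded"
  writer.close
end

writer.close
REPORT = reader.read.lines.map(&:chomp).sort
Process.wait(worker1)
puts REPORT
```

Only worker 1 pays the cost of loading; the forked workers start with the same pages and copy one only when they write to it.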
Slide 31
Slide 31 text
Reforking
Restart workers after a while
[Diagram: Steep Master, Steep Worker 1, Steep Worker 2, and Steep Worker 3 with one shared RBS::Env; arrows labeled fork, kill, kill]
Slide 32
Slide 32 text
Reforking
Restart workers after a while
[Diagram: Steep Master and Steep Worker 1 with RBS::Env; one arrow labeled fork]
Slide 33
Slide 33 text
Reforking
Restart workers after a while
[Diagram: Steep Master, Steep Worker 1, Steep Worker 2, and Steep Worker 3 with one shared RBS::Env; arrows labeled fork, refork, refork]
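The kill-then-refork cycle from the last three slides can be sketched like this. It is a simplified illustration, not Steep's implementation: the running script plays the role of the warmed-up Worker 1, `SHARED_ENV` is a hypothetical stand-in for its RBS::Environment, the helper names are made up, and the workers just idle. It requires a *nix system for `fork`.

```ruby
# Hypothetical stand-in for the environment held by the warmed-up process.
SHARED_ENV = Array.new(10_000) { |i| "Type#{i}" }

# Fork `count` idle workers from the current process; each inherits SHARED_ENV.
def fork_workers(count)
  Array.new(count) { fork { sleep } }
end

# Send TERM to each worker and reap it, returning the Process::Status objects.
def stop_workers(pids)
  pids.each { |pid| Process.kill(:TERM, pid) }
  pids.map { |pid| Process.wait2(pid).last }
end

first_generation = fork_workers(2)
# ... workers run for a while and their memory drifts away from the parent's ...
RETIRED = stop_workers(first_generation)   # "kill" on the slides
second_generation = fork_workers(2)        # "refork": fresh copies share pages again
CLEANED_UP = stop_workers(second_generation)

puts "retired #{RETIRED.size} workers, then reforked and stopped #{CLEANED_UP.size}"
```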
Slide 34
Slide 34 text
Conclusion
Slide 35
Slide 35 text
Conclusion
● New memory profiler: Majo
○ It collects long-lived object allocations
● Steep will have a more CoW-friendly structure
○ Fork-Worker and Reforking
Thanks for listening!