Slide 1

Slide 1 text

mruby on C#: From VM Implementation to Game Scripting

Slide 2

Slide 2 text

mruby is the lightweight implementation of the Ruby language. It can be linked and embedded within your application.

Slide 3

Slide 3 text

mruby + low-resource environments https://asya81.hatenablog.com/entry/2025/12/23/010736 https://vixion.jp

Slide 4

Slide 4 text

mruby is the lightweight implementation of the Ruby language. It can be linked and embedded within your application.

Slide 5

Slide 5 text

Application embedded
Embedding scripts into standard applications with sufficient resources
Application | Language | Embedded script language
k6 | Go | JS
WezTerm | Rust | Lua
Neovim | C | Lua
Blender | C/C++ | Python
GitHub Actions | C# | YAML
Godot | C++ | GDScript (original)
Nix | C++ | Nix lang (original)
Terraform | Go | HCL (original)

Slide 6

Slide 6 text

Hmm...

Slide 7

Slide 7 text

Application embedded
Embedding scripts into standard applications with sufficient resources
Application | Language | Embedded script language
k6 | Go | JS
WezTerm | Rust | Lua
Neovim | C | Lua
Blender | C/C++ | Python
GitHub Actions | C# | YAML
Godot | C++ | GDScript (original)
Nix | C++ | Nix lang (original)
Terraform | Go | HCL (original)
Where is mruby?

Slide 8

Slide 8 text

Q. Why isn't mruby popular for app embedding?

Slide 9

Slide 9 text

In fact, Ruby excels at DSLs for configurations / sequences
• E.g. Vagrant
• E.g. Fastlane
• E.g. Homebrew
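
To make the DSL point concrete, here is a minimal sketch of a configuration DSL built on `instance_eval`. The names `ServerConfig`, `configure`, `host`, and `port` are hypothetical and are not taken from Vagrant, Fastlane, or Homebrew.

```ruby
# Hypothetical config DSL; none of these names come from a real tool.
class ServerConfig
  attr_reader :settings

  def initialize
    @settings = {}
  end

  # Each DSL "keyword" is just a method that records a value.
  def host(name)
    @settings[:host] = name
  end

  def port(number)
    @settings[:port] = number
  end
end

# Evaluate the block with the config object as `self`,
# so bare calls like `host "..."` resolve to its methods.
def configure(&block)
  config = ServerConfig.new
  config.instance_eval(&block)
  config.settings
end

SETTINGS = configure do
  host "example.local"
  port 8080
end
```

The block reads like a configuration file, yet it is ordinary Ruby, which is what makes the language attractive for this kind of embedding.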

Slide 10

Slide 10 text

App embedded
• Vagrant: single-language app (Ruby for both configuration and core features)
• WezTerm: 2-layer language app (Lua for configuration, Rust for core features)

Slide 11

Slide 11 text

About Speaker
@hadashiA
• OSS Developer (C#/Unity)
• Experience working as an architect for large-scale online games / web services
• Interested in performance optimization

Slide 12

Slide 12 text

As an OSS author
• hadashiA/VContainer ★2.8k: The extra fast, minimum code size, GC-free DI (Dependency Injection) library running on Unity Game Engine.
• hadashiA/VYaml ★450: The extra fast, low memory footprint YAML library for C#, focused on .NET and Unity.
• hadashiA/VitalRouter ★350: A fast, zero-allocation, in-memory messaging library. Declarative async pipeline with source generator for Unity and .NET.
• hadashiA/DryDB ★100: An ultra fast read-only embedded B+Tree based key/value database, implemented in pure C#.
• hadashiA/Unio ★200: Unio (short for unity native I/O) is a small utility set of I/O using native memory areas.

Slide 13

Slide 13 text

As a Contributor/Committer (2024)
• Cysharp/UniTask ★10.7k: Provides an efficient allocation free async/await integration for Unity.
• Cysharp/ZLogger ★1.6k: Zero Allocation Text/Structured Logger for .NET with StringInterpolation and Source Generator, built on top of a Microsoft.Extensions.Logging.
• Cysharp/MemoryPack ★4.4k: Zero encoding extreme performance binary serializer for C# and Unity.
• Cysharp/R3 ★3.7k: The new future of dotnet/reactive and UniRx.
• Cysharp/Ulid ★1.6k: Fast .NET C# Implementation of ULID for .NET and Unity.

Slide 14

Slide 14 text

Game development is 2-layer
• Web services: Users generate content dynamically at runtime.
• Games: Massive amounts of content are authored upfront and shipped with the product.

Slide 15

Slide 15 text

Game development is 2-layer..
The primary focus is on "data work," which is not hard-coded into the application system.
• Assets (art, sound, ...)
• Spatial configuration
  • What exists where in the scene
  • What happens when the player interacts with it
• Temporal configuration
  • Cutscene
  • Character performance
  • Scenario
• Emergent property
  • Skill effect / Item effect / etc.
https://unityroom.com/games/hibana

Slide 16

Slide 16 text

Game development is 2-layer..
“Data work” is hard
• Without affecting the game system
• Can be operated by non-programmers
• Allows for focusing on content

Slide 17

Slide 17 text

Game development is 2-layer..
“Data work” is hard
• GUI
  • Timeline Editor
  • Graph Node Editor
  • Scene Editor
• Database / Data sheet / Spreadsheet
• Hot-reloadable, Human-readable Text
  • YAML / TOML / cue / pkl
  • Bytecode machine

Slide 18

Slide 18 text

“Data work” patterns
• GUI (In some cases, developing internal GUI tools is part of the project itself.)
  • Timeline Editor
  • Graph Node Editor
  • Scene Editor
• Database / Data sheet / Spreadsheet
• Hot-reloadable, Human-readable Text (Actually, it saves so much effort in tool development! Text is super writer-friendly!)
  • YAML / TOML / cue / pkl
  • Bytecode machine

Slide 19

Slide 19 text

Hot-reloadable, Human-readable scripting The primary focus is on "data work," which is not hard-coded into the application system.

Slide 20

Slide 20 text

Hot-reloadable, Human-readable scripting The primary focus is on "data work," which is not hard-coded into the application system.

Slide 21

Slide 21 text

In reality..
Using mruby for application embedding involves a high hurdle..
1. Binding boilerplate: C ABI embedding requires extensive bindings
2. Portability burden: Separate mruby build needed per target
3. Async friction: Non-trivial integration with async runtimes

Slide 22

Slide 22 text

The application is not written in C..
• Apps are written in managed languages that hide platform differences
  • mruby requires a build step
  • mruby requires C FFI
  • and integration with mruby's GC-managed memory
• For native binaries, modern devs reach for Rust/Zig by default
  • Their native types aren't C ABI compatible

Slide 23

Slide 23 text

Applications are not written in C.. In the case of Unity

Slide 24

Slide 24 text

The application is not written in C..
In the case of Unity
[Figure: a separate native library per target platform: libmruby.dylib (x64/arm64), libmruby.dll, libmruby.so (x64/arm64), wasm]

Slide 25

Slide 25 text

The real value of mruby lies in making custom builds. mruby has a simple build system that is independent of the language toolchain. We need to understand the C build settings for the target platform.

Slide 26

Slide 26 text

The application is not written in C.. C FFI is possible, but writing code that crosses the boundary is cumbersome. Only blittable types can cross the boundary. Risks of memory leaks and segmentation faults when crossing language boundaries.

Slide 27

Slide 27 text

And.. client-side programming = asynchronous

Slide 28

Slide 28 text

And.. client-side programming = asynchronous
• Programming in "per-frame" steps: like differentiation
• We want to program a more abstract sequence! Like integration
[Figure: game screenshots on a timeline: 00:00 (0f), 00:01 (60f), 00:02 (120f), 00:03 (180f), 00:04 (240f), 00:05 (300f). © Nintendo / Animal Crossing: New Horizons]

Slide 29

Slide 29 text

And.. client-side programming = asynchronous
1.
2. Display text with a typewriter effect
3. Suspend
4. Wait for button click
5. Resume
6.
7. Display next text..

Slide 30

Slide 30 text

1. Portability burden: Separate mruby build needed per target
2. Binding boilerplate: C ABI embedding requires extensive bindings
3. Async friction: Non-trivial integration with async runtimes

Slide 31

Slide 31 text

Q. There is a solution that resolves all of these at once. What is it?

Slide 32

Slide 32 text

A. One of the answers: Re-implementing the mruby VM in a cross-platform, managed language with an async ecosystem

Slide 33

Slide 33 text

Reimplementing virtual machines in managed languages is common
• Lua runtime written in C#
  • nuskey8/Lua-CSharp
  • MoonSharp
  • KopiLua
  • UniLua
• JS runtime written in C#
  • akeit0/okojo
  • Jint

Slide 34

Slide 34 text

mruby architecture Compile on the host machine, and deploy to the target bytecode machine.

Slide 35

Slide 35 text

mruby architecture
Compile on the host machine, and deploy to the target bytecode machine.
• Compilers: mruby-compiler (original), picoruby/mruby-compiler2
• Bytecode machines: mruby (original), mruby/c, mruby/edge

Slide 36

Slide 36 text

hadashiA/MRubyCS
Bytecode machines: mruby, mruby/c, mruby/edge, mruby/cs (NEW)

Slide 37

Slide 37 text

hadashiA/MRubyCS
• Pure C# mruby bytecode machine + picoruby/mruby-compiler2 C# bindings.
• Designed for seamless integration with C# game engines.
• Easily embed Ruby into Unity/.NET

Slide 38

Slide 38 text

hadashiA/MRubyCS
$ dotnet add package MRubyCS

Slide 39

Slide 39 text

Applications are not written in C..
In the case of Unity
$ dotnet add package MRubyCS
mruby runs without the need for a custom build.

Slide 40

Slide 40 text

The state of C# / .NET
High-throughput, low-latency modern runtime
• Modern server runtime
• Linux first
• Build OCI container images directly
• Top-level low latency and high throughput
• Mature Logging / Tracing / Configuration / Testing
• Equipped with HTTP/1, 2, and 3
• Cross-platform SDK for Linux, macOS, and Windows
• Games and GUI
• NativeAOT is becoming practical for real-world use.
• Fast compilation
• Highest precision static analysis, code fix, and decompiler
https://github.com/LesnyRumcajs/grpc_bench

Slide 41

Slide 41 text

hadashiA/MRubyCS
Advantage of C# implementation
• The high-throughput GC already built into C# is available for use.
• “Exceptions” also exist in C#.
• A highly developed async ecosystem is available.
• An integrated, all-in-one toolchain

Slide 42

Slide 42 text

hadashiA/MRubyCS
Define Ruby class/method/module in C#
[Code figure: class A; def with_block(&block); def with_kargs(foo:, *args)]

Slide 43

Slide 43 text

hadashiA/MRubyCS
Call Ruby from C#

Slide 44

Slide 44 text

hadashiA/MRubyCS Ruby exceptions can be caught in C#.

Slide 45

Slide 45 text

Ok, an unofficial mruby runtime is out. Would you use it?

Slide 46

Slide 46 text

How do we get people to trust alternative VM implementations?
1. Compatibility
2. Performance
3. A guide to application design

Slide 47

Slide 47 text

How do we get people to trust alternative VM implementations?
1. Compatibility
2. Performance
3. A guide to application design

Slide 48

Slide 48 text

1. Compatibility
MRubyCS prioritizes Ruby-level compatibility with mruby
• Control-flow
  • Block / Proc
  • loop / raise / rescue / ensure / yield / retry / redo / return / break
• Built-in classes
  • Array, Float, Hash, Integer, Nil, Range, Symbol, String, Class, Module, Proc
  • Time, Random
  • Fiber
• Type, module system
  • subclass / include / prepend / extend / singleton-class / Class.new / Module.new
  • instance_eval / class_eval
• Method
  • Keyword arguments / Rest arguments
  • public / private / protected

Slide 49

Slide 49 text

1. Compatibility Singleton-class
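
The slide's code is not in the transcript. As a stand-in, here is a minimal sketch in plain Ruby of the singleton-class behavior a compatible VM has to reproduce; the `Greeter` class is a made-up example.

```ruby
class Greeter
  def hello
    "hello"
  end
end

a = Greeter.new
b = Greeter.new

# `def obj.method` stores the method in a's singleton class only.
def a.hello
  "hi from a"
end

# `class << obj` opens the same singleton class explicitly.
class << a
  def bye
    "bye from a"
  end
end

SINGLETON_RESULTS = {
  a_hello: a.hello,
  b_hello: b.hello,
  a_methods: a.singleton_methods.sort,
  b_has_bye: b.respond_to?(:bye)
}
```

Only `a` sees the overrides; `b` still dispatches to `Greeter#hello`.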

Slide 50

Slide 50 text

1. Compatibility Module include, prepend stack
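
The original code image is missing here too. The following sketch, with made-up module names, shows the method lookup order that `include` and `prepend` produce and how `super` walks that chain.

```ruby
module Loud
  def speak
    super.upcase
  end
end

module Polite
  def speak
    "please, " + super
  end
end

class Animal
  def speak
    "hello"
  end
end

class Dog < Animal
  include Polite   # inserted after Dog in the lookup chain
  prepend Loud     # inserted before Dog in the lookup chain

  def speak
    super + "!"
  end
end

DOG_ANCESTORS = Dog.ancestors.take(4)
DOG_SPEAK = Dog.new.speak
```

The call starts at `Loud`, passes through `Dog` and `Polite`, and bottoms out at `Animal`, so every layer of the stack contributes to the final string.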

Slide 51

Slide 51 text

1. Compatibility Exceptions
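
As an illustration of the exception semantics in question (the slide's own code is not in the transcript), this sketch uses a hypothetical `ScriptFailure` class to show `raise`, `rescue`, and `ensure` on both the normal and the error path.

```ruby
class ScriptFailure < StandardError; end   # hypothetical error class

EVENTS = []

def risky(fail)
  EVENTS << :start
  raise ScriptFailure, "boom" if fail
  :ok
rescue ScriptFailure => e
  EVENTS << :"rescued_#{e.message}"
  :recovered
ensure
  EVENTS << :ensure   # runs on both the normal and the error path
end

RISKY_RESULTS = [risky(false), risky(true)]
```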

Slide 52

Slide 52 text

1. Compatibility Method args

Slide 53

Slide 53 text

1. Compatibility Method args
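
A sketch of the argument kinds a Ruby-compatible VM has to bind correctly: required, optional, rest, keyword, keyword-rest, and block. The method name `describe` is made up for this example.

```ruby
def describe(required, optional = "default", *rest, key:, flag: false, **options, &block)
  {
    required: required,
    optional: optional,
    rest: rest,
    key: key,
    flag: flag,
    options: options,
    block_result: block ? block.call(required) : nil
  }
end

ARGS_FULL = describe(1, 2, 3, 4, key: "k", flag: true, extra: :x) { |v| v * 10 }
ARGS_MIN  = describe(1, key: "k")
```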

Slide 54

Slide 54 text

1. Compatibility Block return
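
Non-local exits from blocks are one of the trickier parts of Ruby control flow for a VM. This sketch, using made-up method names, contrasts `return`, `break`, and `next` inside a block.

```ruby
def first_even(numbers)
  numbers.each do |n|
    return n if n.even?   # `return` in a block exits first_even itself
  end
  nil
end

def break_demo
  result = [1, 2, 3].each do |n|
    break n * 100 if n == 2   # `break` ends `each` and becomes its value
  end
  [:after_each, result]
end

def next_demo
  [1, 2, 3].map do |n|
    next 0 if n == 2          # `next` ends only this block invocation
    n
  end
end

BLOCK_RESULTS = [first_even([1, 3, 4, 5]), break_demo, next_demo]
```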

Slide 55

Slide 55 text

1. Compatibility Builtin class libs

Slide 56

Slide 56 text

1. Compatibility Fiber

Slide 57

Slide 57 text

1. Compatibility Fiber
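
A minimal sketch of the Fiber semantics involved: values flow in through `resume` and out through `Fiber.yield`, and the fiber keeps its local state between suspensions.

```ruby
fiber = Fiber.new do |first|
  second = Fiber.yield(first + 1)   # suspend, hand first + 1 to the resumer
  third = Fiber.yield(second * 2)
  "done with #{third}"
end

FIBER_TRACE = [
  fiber.resume(1),      # runs until the first Fiber.yield
  fiber.resume(10),     # 10 becomes the value of the first Fiber.yield
  fiber.resume("end"),  # block finishes; its value is returned
  fiber.alive?
]
```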

Slide 58

Slide 58 text

1. Compatibility
Strategies for achieving compatibility
• The tests for mruby/mruby are written in Ruby.
• Therefore,
  • The first goal was getting the simple test framework of mruby/mruby to work.
  • Next, port the tests and pass the test cases one by one.
• Currently, more than 4,200 test cases are passing.
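
To show why a Ruby-written test suite is a convenient bootstrap target, here is a hypothetical miniature harness. The names `spec` and `expect_equal` are invented for this sketch and are not mruby's actual test helpers; the point is only that such a framework needs little more than blocks, exceptions, and string interpolation to run.

```ruby
# Hypothetical miniature test harness, not mruby's real test framework.
SPEC_RESULTS = { passed: 0, failed: [] }

def expect_equal(expected, actual)
  return if expected == actual
  raise "expected #{expected.inspect}, got #{actual.inspect}"
end

def spec(name)
  yield
  SPEC_RESULTS[:passed] += 1
rescue => e
  SPEC_RESULTS[:failed] << "#{name}: #{e.message}"
end

spec("Array#push") { expect_equal([1, 2], [1].push(2)) }
spec("Integer#+")  { expect_equal(5, 2 + 2) }   # deliberately failing case
```

Once a harness of this shape runs on a new VM, each ported test case becomes a direct measure of compatibility.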

Slide 59

Slide 59 text

How do we get people to trust alternative VM implementations?
1. Compatibility
2. Performance
3. A guide to application design

Slide 60

Slide 60 text

2. Performance
MRubyCS is optimized for execution speed
• C# VM implementations have pros and cons compared to C versions.
• C# advantage
  • Direct Read/Write to C# Struct Memory Layout
  • Roslyn Compiler Pipeline
  • Leveraging .NET JIT Tiered Compilation
• C advantage
  • Opcode dispatch via unsafe direct jumps
  • 1-word mrb_value
  • Unsafe pointer by default

Slide 61

Slide 61 text

2. Performance
C#’s struct
Like Swift (and Rust), C# distinguishes value and reference types at the type level.
C# value types lay out their members directly. (With no header, no heap allocation, no write-barrier.)

Slide 62

Slide 62 text

2. Performance
C#’s struct
Like Swift (and Rust), C# distinguishes value and reference types at the type level.
Reference types are heap-allocated and store pointers.

Slide 63

Slide 63 text

2. Performance The unit of data within the mruby VM world is the mruby value.

Slide 64

Slide 64 text

2. Performance The unit of data within the mruby VM world is the mruby value. The size of mrb_value in mruby's default configuration (WORD_BOXING) is 8 bytes. https://github.com/mruby/mruby/blob/master/doc/internal/boxing.md

Slide 65

Slide 65 text

2. Performance
C#’s MRubyValue (take 1): word boxing = 8 bytes
64-bit word boxing implementation

Slide 66

Slide 66 text

2. Performance
C#’s MRubyValue (take 1): word boxing = 8 bytes
Warning: a value that falls within the .NET GC heap's address range but is not a real object address will crash.

Slide 67

Slide 67 text

2. Performance
C#’s MRubyValue (take 2): 8 bytes + 8 bytes
Same as word-boxing

Slide 68

Slide 68 text

2. Performance
Presym
• Ruby symbols are internally unsigned 32-bit integers.
• Symbols in source code become integers during parsing.
• mruby’s “presym”
  • Precalculates integer values for well-known symbols (e.g., `:=`, `+`, `-`).
  • C macros compile symbol names into integers.
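
The idea of pre-assigned symbol IDs can be sketched in a few lines of Ruby. This `SymbolTable` class is hypothetical and only illustrates the concept; mruby does this work at build time with C macros, not at runtime as shown here.

```ruby
# Hypothetical symbol table; a sketch of the idea, not mruby's presym code.
class SymbolTable
  def initialize(well_known)
    @ids = {}
    @names = []
    # Well-known names get fixed IDs up front, in a stable order.
    well_known.each { |name| intern(name) }
    @presym_count = @names.size
  end

  # Return the existing ID for a name, or assign the next free one.
  def intern(name)
    @ids[name] ||= begin
      @names << name
      @names.size - 1
    end
  end

  def name_of(id)
    @names[id]
  end

  def presym?(id)
    id < @presym_count
  end
end

TABLE = SymbolTable.new(%w[+ - == initialize to_s])
PLUS_ID = TABLE.intern("+")
CUSTOM_ID = TABLE.intern("my_method")
```

Because the well-known IDs are fixed before any script runs, VM code can compare against them as plain integer constants instead of looking names up.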

Slide 69

Slide 69 text

2. Performance
Presym
• C# compiler pipeline
  • Generate well-known symbols at compile time
  • Like a macro

Slide 70

Slide 70 text

2. Performance
Read opcode, operand

Slide 71

Slide 71 text

2. Performance
Read opcode, operand
Opcode (1 byte), operand (variable-length)

Slide 72

Slide 72 text

2. Performance
Read opcode, operand
Operand (variable-length)
The VM is a byte sequence reader
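
The "byte sequence reader" view can be sketched as a small decoder. The opcode numbers and operand layouts below are invented for illustration and are not mruby's real instruction encoding; only the shape (a 1-byte opcode followed by variable-length operands) follows the slide.

```ruby
# Hypothetical opcodes and operand layouts, for illustration only.
OPCODES = {
  0x01 => [:load_int, [1, 2]],  # register (1 byte), value (2 bytes, big-endian)
  0x02 => [:add,      [1, 1]],  # destination register, source register
  0x03 => [:stop,     []]
}

def decode(bytes)
  pc = 0
  instructions = []
  while pc < bytes.size
    name, operand_sizes = OPCODES.fetch(bytes[pc])
    pc += 1
    operands = operand_sizes.map do |size|
      # Fold `size` bytes into one big-endian integer.
      value = bytes[pc, size].inject(0) { |acc, b| (acc << 8) | b }
      pc += size
      value
    end
    instructions << [name, *operands]
  end
  instructions
end

PROGRAM = [0x01, 0, 0x01, 0x2C, 0x01, 1, 0x00, 0x05, 0x02, 0, 1, 0x03]
DECODED = decode(PROGRAM)
```

Every instruction costs several small reads like these, which is why the cost of each read (including any bounds check) matters so much for interpreter speed.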

Slide 73

Slide 73 text

2. Performance
Some advantages of using C
Directly jump to the address specified by the label!

Slide 74

Slide 74 text

2. Performance
Some advantages of using C
Pointer arithmetic in C does not have bounds checking.

Slide 75

Slide 75 text

2. Performance
Almost all languages supporting arrays or sequences insert checks for language-specific errors
• Modern programming languages have "slices," which refer to a part of a contiguous memory region.
  • C#: Span
  • Go: Slice
  • Rust: Slice
  • Swift: Span
• Yeah, we can read a range of memory with zero copies, but it still performs bounds checks.

Slide 76

Slide 76 text

2. Performance
References to "byte sequences" in C lack bounds checks.

Slide 77

Slide 77 text

2. Performance
Almost all languages supporting arrays or sequences insert checks for language-specific errors
[Figure: unoptimized C# x64 assembly; the index 10 is compared against Length, and an exception is thrown if it is out of range]

Slide 78

Slide 78 text

2. Performance
Read operand: naive implementation
If the compiler cannot be certain, bounds checks (branches) will be inserted into everything!
See: "【.NET】境界チェックが消えるパターン集" ([.NET] A collection of patterns where bounds checks disappear)

Slide 79

Slide 79 text

2. Performance
Read operand: optimized implementation
Match C# type layout to the operand
Direct copy (use a “managed pointer” instead of Span)

Slide 80

Slide 80 text

2. Performance
C#'s tiered compilation
• Bounds-check, overflow-check elimination
• Inlining
• Method de-virtualization
• Loop optimizations
• Tail-recursion removal
• Code layout in memory to optimize processor caches
• …and many more

Slide 81

Slide 81 text

2. Performance
C#’s Tiered-JIT compilation
[Figure: Tier 0 code vs. Tier 1 code]
`GetNovemberDays` will be compiled into the constant 30.

Slide 82

Slide 82 text

2. Performance
The topic of JIT optimization budgets
[Figure: C# source, non-optimized code (x64), optimized code (x64)]

Slide 83

Slide 83 text

2. Performance The topic of JIT optimization budgets • Checking the assembly, MRubyCS's initial main loop wasn't optimized at all. • Why??

Slide 84

Slide 84 text

2. Performance
MRubyCS main loop
The disadvantage of a giant switch statement..

Slide 85

Slide 85 text

2. Performance
The topic of JIT optimization budgets
• The JIT does not spend an infinite amount of time on optimization.
• The budget for optimization is determined to some extent based on heuristics.
• The main loop of MRubyCS is very large, and the JIT exhausts its optimization budget before the loop can be sufficiently optimized.

Slide 86

Slide 86 text

2. Performance
The topic of JIT optimization budgets
Optimize by not optimizing
• Reduce the IL size of the main loop.
  → Extract cold paths into separate methods and intentionally mark them with [MethodImpl(MethodImplOptions.NoInlining)].
  → Steadily reduce the number of local variables.
  → Stop catching arithmetic overflows with granular try/catch blocks.

Slide 87

Slide 87 text

2. Performance
Reduce IL code size results:
Metric | Before optimization | After optimization | Change
Stack frame size | 1,608 bytes | 856 bytes | -47%
Number of `call` instructions | 507 | 285 | -44%
MRubyValue accessor calls | 105 | 10 | -90%
CORINFO_HELP_OVERFLOW | 4 | 0 | -100%
Number of funclets | 8 | 1 | -88%
MRubyValue ctor calls | 7 | 0 | -100%

Slide 88

Slide 88 text

2. Performance MRubyCS benchmark results v0.61.3 • It's slightly faster than original mruby, or at least on par. • Still room for research, of course. • JIT via C# IL Emit should also be possible.

Slide 89

Slide 89 text

hadashiA/MRubyCS
How do we get people to trust alternative VM implementations?
1. Compatibility
2. Performance
3. A guide to application design

Slide 90

Slide 90 text

3. Application design
However, it's still not enough..
• We can do everything, which is the problem..
• To begin with, game and client-side programming is characterized by extremely complex control flow.

Slide 91

Slide 91 text

3. Application design
Game development tends to get messy.
• An event triggers separate events in distant locations, and they form a chain reaction.
• “Spaghetti code” is code in which there is no distinction between the controlling side and the controlled side.

Slide 92

Slide 92 text

3. Application design
Complex client-side applications require one-way data flow. Games are no exception.
• React / preact / Solid / Svelte
• Flutter
• SwiftUI
• Jetpack Compose

Slide 93

Slide 93 text

3. Application design
Game development tends to get messy.
[Figure: spaghetti vs. unidirectional data flow]

Slide 94

Slide 94 text

3. Application design
In-memory pub/sub messaging is a powerful pattern
https://vitalrouter.hadashikick.jp

Slide 95

Slide 95 text

3. Application design
In-memory pub/sub messaging is a powerful pattern
https://vitalrouter.hadashikick.jp
Abstract it as a "message" to the application, independent of the input source.
By simply spoofing this “message”, we can do anything: testing, auto-play, and demos.

Slide 96

Slide 96 text

3. Application design
In-memory pub/sub messaging is a powerful pattern
https://vitalrouter.hadashikick.jp
[Figure: Publisher → Interceptor A → Interceptor B → Interceptor C → Subscriber; example interceptors: exception handling, logging, filtering, throttling / switching]
The Interceptor pattern becomes applicable.
Asynchronous processing is not just about completing something later!!
• Cancellation
• Error propagation
• Throttling / Dropping / Switching
→ In FP terms, it's basically composition.
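
The interceptor idea can be sketched as function composition around a delivery step. The `Router` class below is a hypothetical, synchronous miniature written in Ruby for brevity; it is not VitalRouter's API, which is C# and async.

```ruby
# Hypothetical mini router; not VitalRouter's API.
class Router
  def initialize
    @subscribers = []
    @interceptors = []
  end

  def subscribe(&handler)
    @subscribers << handler
  end

  # An interceptor receives the message and a `next_step` callable.
  def intercept(&interceptor)
    @interceptors << interceptor
  end

  def publish(message)
    deliver = ->(msg) { @subscribers.each { |s| s.call(msg) } }
    # Wrap the delivery step from the inside out so the first
    # registered interceptor runs first.
    pipeline = @interceptors.reverse.inject(deliver) do |next_step, interceptor|
      ->(msg) { interceptor.call(msg, next_step) }
    end
    pipeline.call(message)
  end
end

RECEIVED = []
LOGGED = []
router = Router.new
router.intercept { |msg, next_step| LOGGED << msg; next_step.call(msg) }       # logging
router.intercept { |msg, next_step| next_step.call(msg) unless msg == :spam } # filtering
router.subscribe { |msg| RECEIVED << msg }

router.publish(:move)
router.publish(:spam)
router.publish(:attack)
```

Each interceptor decides whether and how to call the next step, so logging, filtering, and error handling stack up without the publisher or subscriber knowing about them.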

Slide 97

Slide 97 text

3. Application design in-memory messaging — hadashiA/VitalRouter Publish command Async subscriber pipelines

Slide 98

Slide 98 text

mruby to C# messaging in-memory messaging — hadashiA/VitalRouter Ruby C# Publish command Async subscriber pipelines

Slide 99

Slide 99 text

3. Application design in-memory messaging — hadashiA/VitalRouter https://vitalrouter.hadashikick.jp What if we connect mruby to this input...?

Slide 100

Slide 100 text

mruby to C# messaging
What is needed to do this
• Interoperability of values between mruby and C#.
• Pause and resume mruby scripts at any time.

Slide 101

Slide 101 text

mruby to C# messaging
What is needed to do this
• Interoperability of values between mruby and C#. → MRubyCS.Serializer
• Pause and resume mruby scripts at any time. → Integrating Fiber with the C# async/await ecosystem

Slide 102

Slide 102 text

MRubyCS.Serializer
• Interoperability of values between C# and MRubyValue
• Like C# - JSON / C# - YAML / C# - msgpack / C# - protobuf wire format
• Deserialize: Ruby to C#
• Serialize: C# to Ruby

Slide 103

Slide 103 text

Ruby Fiber - C# async/await integration

Slide 104

Slide 104 text

Ruby Fiber - C# async/await integration

Slide 105

Slide 105 text

Ruby Fiber - C# async/await integration

Slide 106

Slide 106 text

3. Application design mruby to C# messaging Type definitions only. No bindings needed at all.

Slide 107

Slide 107 text

3. Application design
mruby to C# messaging
1.
2. Display text with a typewriter effect
3. Fiber.yield
4. Wait for button click
5. Fiber#resume
6.
7. Display next text..
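
The suspend/resume sequence above can be sketched with a plain Ruby Fiber. In the talk the host side is C#; here a hypothetical host loop is written in Ruby so the example is self-contained, and the `:wait_for_click` symbol and `SCREEN` array are invented for illustration.

```ruby
# Hypothetical host loop in Ruby; the real host in the talk is C#.
SCREEN = []

script = Fiber.new do
  ["Hello!", "Nice weather today.", "Bye!"].each do |line|
    SCREEN << line               # show the text
    Fiber.yield :wait_for_click  # suspend until the host resumes us
  end
  :finished
end

clicks = 0
status = script.resume           # run until the first suspension
while status == :wait_for_click
  clicks += 1                    # pretend the player pressed a button
  status = script.resume
end

CLICKS = clicks
FINAL_STATUS = status
```

The script reads as a straight-line sequence, while the host decides when each resumption happens, which is the property that lets a Fiber map onto an awaited operation on the host side.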

Slide 108

Slide 108 text

VitalRouter.MRuby v2 Case Studies
Taisho Nostalgie
https://unityroom.com/games/taisho-nostalgie

Slide 109

Slide 109 text

VitalRouter.MRuby v2 Case Studies
Hibana
https://unityroom.com/games/taisho-nostalgie

Slide 110

Slide 110 text

VitalRouter.MRuby v2 Case Studies
Buta no Game ("Pig Game")

Slide 111

Slide 111 text

https://vitalrouter.hadashikick.jp/extensions/mruby/v2

Slide 112

Slide 112 text

Thanks!