Slide 1

Slide 1 text

Trying to Make Ruby’s Parser Available as a Gem
Shun Hiraoka
Fukuoka RubyistKaigi 04
2024-09-07

Slide 2

Slide 2 text

About me
S.H. (Shun Hiraoka)
Hamada.rb
ESM, Inc
GitHub: S-H-GAMELINKS
Mastodon: [email protected]

Slide 3

Slide 3 text

Fukuoka RubyistKaigi 04

Slide 4

Slide 4 text

Daily Activities

Slide 5

Slide 5 text

I often write code with Ruby and Ruby’s Parser

Slide 6

Slide 6 text

I’ll be talking about Ruby’s Parser

Slide 7

Slide 7 text

Agenda
1. Motivation
2. Gem
3. Implementation
4. Future Prospects and Issues

Slide 8

Slide 8 text

(1) Motivation

Slide 9

Slide 9 text

Want to use Ruby’s Parser
Ruby’s Parser is now available as a Universal Parser

Slide 10

Slide 10 text

But…
Few projects actually use the Universal Parser
I want to see whether it really works as a Universal Parser

Slide 11

Slide 11 text

Example yui-knk/ruby-parser

Slide 12

Slide 12 text

yui-knk/ruby-parser
It may be the only gem using the Universal Parser
It only parses specific code
It does not convert the AST into Ruby objects and return it

Slide 13

Slide 13 text

Inconveniences
Need to modify the code and rebuild to try various Ruby scripts
Unable to work with the AST as Ruby objects
It’s a bit cumbersome to try parsing Ruby code

Slide 14

Slide 14 text

Current status
Errors occur because some function symbols cannot be found
undefined symbol: rb_xmalloc_mul_add

Slide 15

Slide 15 text

Want a way to verify the Universal Parser

Slide 16

Slide 16 text

If it doesn’t exist, make it
I’ll better understand Ruby’s Parser by building it
Above all, it sounds interesting

Slide 17

Slide 17 text

Consider the specs

Slide 18

Slide 18 text

Easy to test code
Prism.parse
Code can be passed as a string
The parsed result is returned by Prism as an AST

Prism.parse('1 + 1')
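For reference, a minimal sketch of what the Prism call on this slide gives back. It assumes the prism gem is available (it ships with Ruby 3.3 and later); the variable names are illustrative only.

```ruby
require "prism"

# Parse a source string; Prism returns a ParseResult that wraps the AST.
result = Prism.parse("1 + 1")

# The root of the tree is a ProgramNode; its statements hold the parsed expressions.
program = result.value
call = program.statements.body.first

puts result.success? # true when the source had no syntax errors
puts call.class      # node class for the `1 + 1` method call
puts call.name       # the called method name as a Symbol
```

This string-in, tree-out interface is the shape the talk aims to reproduce for Ruby’s own parser.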

Slide 19

Slide 19 text

Doing the same would be good

Slide 20

Slide 20 text

Read the implementation of yui-knk/ruby-parser

Slide 21

Slide 21 text

Implementation
Using an extracted Ruby Parser
Building a custom adapter for the Universal Parser

void
parser_config_initialize(rb_parser_config_t *config)
{
    config->calloc = calloc;
    config->malloc = malloc;
    config->alloc = malloc;
    config->alloc_n = alloc_n;
    config->sized_xfree = sized_xfree;
    // And other adapter function set
}

Slide 22

Slide 22 text

Is there an easier way?
rb_parser_params_new function
Returns a parser built with the Universal Parser adapter used in Ruby

rb_parser_t *
rb_parser_params_new(void)
{
    return rb_ruby_parser_new(&rb_global_parser_config);
}

Slide 23

Slide 23 text

How to convert the AST?
It might be a good idea to return the AST as a Prism-like class
But I have no plans for it to be used in any future products
So, using Array and Hash should be fine for now

Slide 24

Slide 24 text

Implementation plan
Import and use only Ruby’s Parser
Reuse the adapter that Ruby provides
Convert the AST into an Array and Hash

Slide 25

Slide 25 text

(2) Gem

Slide 26

Slide 26 text

Kanayago

Slide 27

Slide 27 text

Origin of the name
Inspired by Kanayago, the god of ironworking
The process of converting the AST into an Array or Hash reminded me of refining and processing iron ore

Slide 28

Slide 28 text

About Kanayago
Works on Ruby 3.4.0-dev
Ruby must also be built with the Universal Parser enabled
Because it uses the Universal Parser, parsing is fully Ruby-compatible
But only nodes that can be converted to Hash and Array are returned

Slide 29

Slide 29 text

Supported AST nodes
Literal objects, e.g. Integer, Float, Symbol, String…
Local variables
Instance variables
Class and method definitions
if and unless expressions

Slide 30

Slide 30 text

Example Code
Takes a String as an argument, similar to Prism
Returns the AST as a Hash and Array

result = Kanayago.parse('1 + 1')
body = result[:NODE_SCOPE][:body]
args = body[:NODE_OPCALL][:args]
args[:NODE_LIST][0] # => { NODE_INTEGER: 1 }
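Because the result is plain Hashes and Arrays, standard methods such as Hash#dig are enough to navigate it. The sketch below does not call Kanayago; it uses a hand-written stand-in for the return value. Only the keys shown on the slide come from the talk, and the :recv and :mid keys are assumptions added to make the example complete.

```ruby
# Hand-written stand-in for what Kanayago.parse('1 + 1') might return.
# :recv and :mid are assumed keys; the other keys appear on the slide.
result = {
  NODE_SCOPE: {
    args: nil,
    body: {
      NODE_OPCALL: {
        recv: { NODE_INTEGER: 1 },
        mid: :+,
        args: { NODE_LIST: [{ NODE_INTEGER: 1 }] }
      }
    }
  }
}

# Hash#dig walks nested Hashes and Arrays in one call and returns nil
# instead of raising when an intermediate key is missing.
first_arg = result.dig(:NODE_SCOPE, :body, :NODE_OPCALL, :args, :NODE_LIST, 0)
p first_arg

# A node type that is not present simply yields nil.
missing = result.dig(:NODE_SCOPE, :body, :NODE_CALL, :args)
p missing
```

Using dig keeps exploratory code short when it is not yet known which node types a given script produces.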

Slide 31

Slide 31 text

(3) Implementation

Slide 32

Slide 32 text

Implementation
Takes source code as a String argument
Passes the code to the Universal Parser, which turns it into an AST
Converts the AST into a Hash and Array

Slide 33

Slide 33 text

Kanayago.parse
Kanayago.parse accepts the source code and passes it to the kanayago_parse function

module Kanayago
  def self.parse(source)
    kanayago_parse(source)
  end
end

Slide 34

Slide 34 text

kanayago_parse
Passes the code to the Universal Parser and receives the AST

static VALUE
kanayago_parse(VALUE self, VALUE source)
{
    struct ruby_parser *parser;
    rb_parser_t *parser_params;

    // Set Ruby Parser struct...

    VALUE vast = rb_parser_compile_string(vparser, "main", source, 0);
    rb_ast_t *ast = rb_ruby_ast_data_get(vast);

    return ast_to_hash(ast->body.root);
}

Slide 35

Slide 35 text

ast_to_hash
Converts the AST into a Hash and Array

static VALUE
ast_to_hash(const NODE *node)
{
    enum node_type type;

    // Check and Set node type...

    switch (type) {
      // And Other AST NODE case's...
      case NODE_INTEGER:
      case NODE_FLOAT:
      case NODE_RATIONAL:
      case NODE_IMAGINARY:
      case NODE_STR:
      case NODE_SYM:
        return node_literal_to_hash(node);
      default:
        return Qfalse;
    }
}
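The same recursive idea can be sketched in plain Ruby. This is not Kanayago’s code: it swaps in RubyVM::AbstractSyntaxTree (CRuby’s built-in view of its parser’s AST) as the tree source, and node_to_hash is a hypothetical helper name. Unlike the C version, which handles node types case by case and returns false for the rest, this sketch converts every node generically.

```ruby
# Hypothetical helper: recursively convert a RubyVM::AbstractSyntaxTree node
# into nested Hashes and Arrays keyed by "NODE_<TYPE>".
def node_to_hash(node)
  case node
  when RubyVM::AbstractSyntaxTree::Node
    { :"NODE_#{node.type}" => node.children.map { |child| node_to_hash(child) } }
  when Array
    node.map { |child| node_to_hash(child) }
  else
    node # leaf values such as Integer, Symbol, or nil are kept as-is
  end
end

tree = node_to_hash(RubyVM::AbstractSyntaxTree.parse("1 + 1"))
p tree.keys
```

The exact node type names below the root differ between Ruby versions, so the sketch only relies on the tree being rooted at a SCOPE node.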

Slide 36

Slide 36 text

Processing flow
[Sequence diagram. Layers: Ruby, C, Ruby Parser. Participants: Kanayago.parse, kanayago_parse, ast_to_hash, Universal Parser. Message labels: Given Ruby Code with String; Pass to Ruby Code; Return AbstractSyntaxTree; Given AbstractSyntaxTree; Return Converted Hash; Pass to Hash.]

Slide 37

Slide 37 text

A few patches applied
Wrapping a function
Exporting a struct

Slide 38

Slide 38 text

Wrapping a function
Wrapping some functions was necessary to prevent errors caused by undefined symbols

static void *
xmalloc_mul_add(size_t x, size_t y, size_t z)
{
    return rb_xmalloc_mul_add(x, y, z);
}

Slide 39

Slide 39 text

Exporting a struct
It was required because of the use of the rb_parser_params_new function

struct ruby_parser {
    rb_parser_t *parser_params;
    enum lex_type type;
    union {
        struct lex_pointer_string lex_str;
        struct {
            VALUE file;
        } lex_io;
        struct {
            VALUE ary;
        } lex_array;
    } data;
};

Slide 40

Slide 40 text

(4) Future Prospects and Issues

Slide 41

Slide 41 text

I’ll nurture it like a bonsai

Slide 42

Slide 42 text

Issues
Support for more AST nodes
Avoid patching Ruby’s Parser
Want to remove dependency on the Universal Parser adapter provided by Ruby

Slide 43

Slide 43 text

Support for more AST nodes
Expand AST support

Kanayago.parse(<<~CODE)
  {name: :kanayago, method: :parse} in {name: }
CODE
#=> {:NODE_SCOPE=>{:args=>nil, :body=>false}}
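Since ast_to_hash returns false for node types it does not handle, one rough way to spot gaps is to search the converted result for false values. The sketch below is an illustration only: unsupported_paths is a hypothetical helper that is not part of Kanayago, and it assumes false appears only as the marker for an unconverted node.

```ruby
# Hypothetical helper: collect the key paths whose value is `false`,
# the value ast_to_hash returns for node types it does not handle.
def unsupported_paths(tree, path = [])
  case tree
  when Hash
    tree.flat_map { |key, value| unsupported_paths(value, path + [key]) }
  when Array
    tree.each_with_index.flat_map { |value, i| unsupported_paths(value, path + [i]) }
  when false
    [path]
  else
    []
  end
end

# The Hash shown on the slide for the pattern-matching example.
result = { NODE_SCOPE: { args: nil, body: false } }
p unsupported_paths(result) # => [[:NODE_SCOPE, :body]]
```

Running such a check over a set of sample scripts would give a quick list of which node conversions are still missing.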

Slide 44

Slide 44 text

Avoid patching
I want to submit the function wrappers as a patch to Ruby core
I am considering better approaches to the struct export

Slide 45

Slide 45 text

Adapter dependency
SEGV occurs when the adapter is modified
So, I want to build a custom adapter

Slide 46

Slide 46 text

Conclusion
With effort, Ruby’s Parser can be used as a Universal Parser
And you can provide it as a gem

Slide 47

Slide 47 text

Finally
We can use Ruby’s Parser as a Universal Parser
Let’s all make more use of Ruby’s Parser!