student at NII (National Institute of Informatics). • Studying the application of automata theory and formal languages. • Recently interested in nominal sets. • Currently on the job market (!)
[Diagram: four Regexp preprocessing stages, numbered 1 to 4] Ruby (prism/parse.c) • Handling escape sequences before compiling Regexp (re.c) • Fast checking for collecting named captures (prism/regexp.c) • Parsing with Onigmo (regparse.c)
are for the Prism case. • Regexp preprocessing stages depend on the Ruby parser and on how a Regexp value is created. • parse.y also has its own preprocessing stage, but it uses Onigmo for collecting named captures. • When Regexp.new(...) is called, preprocessing starts from stage 3.
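A minimal sketch of why named captures have to be collected while parsing a Regexp literal, and why a Regexp created with Regexp.new skips that step. It relies only on documented Ruby behavior: a Regexp literal on the left-hand side of `=~` assigns its named captures to local variables. The method names `literal_match` and `dynamic_match` are hypothetical, chosen for illustration.

```ruby
# A Regexp literal on the left-hand side of =~ assigns named captures to
# local variables, so the parser must know the capture names in advance.
def literal_match(input)
  if /(?<year>\d+)-(?<month>\d+)/ =~ input
    [year, month]
  end
end

# A Regexp built at run time defines no local variables; the captures are
# read from the MatchData object instead.
def dynamic_match(input)
  pattern = Regexp.new('(?<year>\d+)-(?<month>\d+)')
  match = pattern.match(input)
  match && [match[:year], match[:month]]
end

p literal_match("2025-04")
p dynamic_match("2025-04")
```

Both calls return the same captures; only the literal form needs the capture names at parse time.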
processing is highly problematic. There are countless bugs. • e.g., 1. `/\c?/ =~ "\x7F"`, but `Regexp.new('\c?') !~ "\x7F"`. 2. `/(?<\x61>x)/ =~ "x"` raises IndexError. 3. `/[]]/` is an error in Prism, but it works in parse.y.
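The examples above can be tried with a small probe script. This is a sketch, not part of the talk: `probe` is a hypothetical helper that evaluates each snippet with `eval` so that a syntax error or exception in one case does not stop the others. The outcomes depend on the Ruby version and on which parser (Prism or parse.y) is in use, so no expected output is shown.

```ruby
# Evaluate a Ruby source snippet and return either its value or the class of
# the exception it raised. SyntaxError is a ScriptError, not a StandardError,
# so both are rescued.
def probe(source)
  eval(source)
rescue ScriptError, StandardError => e
  e.class
end

# %q{...} keeps backslashes as written, so each snippet reaches the parser
# exactly as it appears on the slide.
SNIPPETS = [
  %q{/\c?/ =~ "\x7F"},
  %q{Regexp.new('\c?') =~ "\x7F"},
  %q{/(?<\x61>x)/ =~ "x"},
  %q{/[]]/ =~ "]"},
].freeze

RESULTS = SNIPPETS.to_h { |source| [source, probe(source)] }

RESULTS.each do |source, outcome|
  puts "#{source.ljust(32)} => #{outcome.inspect}"
end
```

Running the same script under both parsers (where the Ruby build allows choosing one) makes the differences between the literal path and the Regexp.new path visible side by side.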
Comparing with [YZ]JIT
[Table: NarakuRuby::DFA.match? throughput in i/s (w/ YJIT) and (w/ ZJIT), with the ratios YJIT / no JIT and ZJIT / no JIT reported as "x faster". Rows: literal, alternation, repetition (greedy), repetition (ambiguous), char_class, anchor. Numeric values are not recoverable.]
Comparing with Onigmo (YJIT enabled)
[Table: Naraku vs. Onigmo throughput in i/s, with the ratio reported as "x slower". Rows: literal, alternation, repetition (greedy), repetition (ambiguous), char_class, anchor. Numeric values are not recoverable.]
Over-optimized NarakuRubyDFA.match? (OO Naraku), with YJIT
[Table: OO Naraku, Naraku, and Onigmo throughput in i/s. Rows: literal ("x slower"), repetition (greedy) ("x slower"), repetition (ambiguous) ("x faster"). Numeric values are not recoverable.]
engine fast as Onigmo. • However, it causes a maintainability issue. • Perhaps my program did not get the full power of ZJIT. I will try to learn the ZJIT architecture.