Slide 1

Slide 1 text

Let's use LLMs from Ruby
~ Refine RBS types using LLM ~
Kokuyou (黒曜) / Shunsuke Mori (@kokuyouwind)
Leaner Technologies, Inc.

Slide 2

Slide 2 text

Talk: Japanese. Slides: English (+ Japanese).

Slide 3

Slide 3 text

$ whoami
Kokuyou (黒曜) / Shunsuke Mori
Twitter or X: @kokuyouwind
Work at: Leaner Technologies, Inc., a B2B startup in the procurement domain
Platinum / Drinkup Sponsor
Day 2: Leaner YAKINIKU Party

Slide 4

Slide 4 text

We're hiring engineers! Feel free to talk to us or DM us. https://careers.leaner.co.jp/

Slide 5

Slide 5 text

I'll talk about something that has nothing to do with my job.

Slide 6

Slide 6 text

Lightning Talk at RubyKaigi 2023: https://rubykaigi.org/2023/presentations/lt/

Slide 7

Slide 7 text

Matz's keynote at RubyKaigi 2023: https://speakerdeck.com/matz/30-years-of-ruby?slide=79

Slide 8

Slide 8 text

Oh no, Matz presented on the same topic I was planning to cover! https://twitter.com/kokuyouwind/status/1656488513666453505

Slide 9

Slide 9 text

This year...

Slide 10

Slide 10 text

😌

Slide 11

Slide 11 text

Large Language Model (LLM)
Machine learning models trained on large amounts of text data. Examples:
- OpenAI ChatGPT 4
- Anthropic Claude 3
- Microsoft GitHub Copilot

Slide 12

Slide 12 text

LLM example: https://claude.ai/

Slide 13

Slide 13 text

Evolution of LLMs since last year
- 💰 Price reduction: x 1/8. $2.0 → $0.25 per 1M tokens (OpenAI gpt-3.5-turbo-0301 to Anthropic Claude 3 Haiku)
- 📚 Context length expansion: x 480. 4096 → 2M tokens (OpenAI gpt-3.5-turbo-0301 to Google Gemini 1.5 Pro 2M)
- 📝 Improvement of coding skills: +23.2 pt. HumanEval 67.0% → 90.2% (GPT-4 to GPT-4o)

Slide 14

Slide 14 text

🤔 Now that LLM capabilities have increased so much, could we even guess RBS types for an entire project?

Slide 15

Slide 15 text

🤨 Now that LLM capabilities have increased so much, could we even guess RBS types for an entire project?

Slide 16

Slide 16 text

😲 Now that LLM capabilities have increased so much, could we even guess RBS types for an entire project?

Slide 17

Slide 17 text

I made RBS Goose as a tool to guess RBS types.

Slide 18

Slide 18 text

Duck

Slide 19

Slide 19 text

Duck quacking like geese (Duck Goose Typing)

Slide 20

Slide 20 text

This talk is about how I built RBS Goose, a tool that guesses RBS types from Ruby code.

Slide 21

Slide 21 text

Agenda
- Purpose of creating RBS Goose
- How RBS Goose works
- Tips for Development with LLM
- Performance Evaluation of RBS Goose
- Conclusion

Slide 22

Slide 22 text

Agenda (repeated). Next section: Purpose of creating RBS Goose

Slide 23

Slide 23 text

I thought it would be interesting if types could be guessed with an LLM.

Slide 24

Slide 24 text

Fin.

Slide 25

Slide 25 text

Let me give you a little more background.

Slide 26

Slide 26 text

What is RBS?
A language for defining Ruby type structures.
https://github.com/ruby/rbs

person.rb:

    class Person
      attr_reader :name

      def initialize(name:)
        @name = name
      end

      def name=(name)
        @name = name
      end
    end

person.rbs:

    class Person
      @name: String

      attr_reader name: String

      def initialize: (name: String) -> void

      def name=: (String name) -> void
    end

Slide 27

Slide 27 text

Why RBS is needed
- Safe development through type checking
  - Detect invalid method calls and similar mistakes before the code is executed
  - Tools such as steep check can be used
- Improved development experience, including completion
  - More accurate completion based on types
  - Tools such as steep-vscode or TypeProf for IDE can be used

Slide 28

Slide 28 text

Existing ways to generate RBS from Ruby
- rbs prototype rb: Ruby → static parsing → RBS
- rbs prototype runtime: Ruby → dynamic load → RBS
- typeprof: Ruby → type-level execution → RBS

Slide 29

Slide 29 text

Each method has its own strengths and weaknesses, and it is difficult to generate a perfect RBS in one shot.

Slide 30

Slide 30 text

rbs prototype rb config.rb
Ruby → static parsing → RBS

config.rb:

    class Config
      def self.configure(&block)
        new.tap(&block)
      end

      %w[hoge fuga piyo].each do |v|
        attr_accessor v
      end
    end

    # config = Config.configure do |c|
    #   c.hoge = 1
    #   c.fuga = 'a'
    #   c.piyo = :piyo
    # end

config.rbs:

    class Config
      def self.configure: () { () -> untyped } -> untyped
    end

    # All untyped, and
    # No attribute accessors

Slide 31

Slide 31 text

rbs prototype runtime -R config.rb Config
Ruby → dynamic load → RBS

config.rb:

    class Config
      def self.configure(&block)
        new.tap(&block)
      end

      %w[hoge fuga piyo].each do |v|
        attr_accessor v
      end
    end

    # config = Config.configure do |c|
    #   c.hoge = 1
    #   c.fuga = 'a'
    #   c.piyo = :piyo
    # end

config.rbs:

    class Config
      def self.configure: () { (*untyped) -> untyped } -> untyped

      public

      def fuga: () -> untyped
      def fuga=: (untyped) -> untyped
      def hoge: () -> untyped
      def hoge=: (untyped) -> untyped
      def piyo: () -> untyped
      def piyo=: (untyped) -> untyped
    end

    # All untyped

Slide 32

Slide 32 text

typeprof config.rb
Ruby → type-level execution → RBS

config.rb:

    class Config
      def self.configure(&block)
        new.tap(&block)
      end

      %w[hoge fuga piyo].each do |v|
        attr_accessor v
      end
    end

    # Required for Type Level Exec
    config = Config.configure do |c|
      c.hoge = 1
      c.fuga = 'a'
      c.piyo = :piyo
    end

config.rbs:

    class Config
      def self.configure: { (Config) -> :piyo }
    end

    # Typed, but
    # No attribute accessors
    # (and I want the return value to be void.)

Slide 33

Slide 33 text

It's pretty hard to fix untyped by hand or to fill in what's missing.

Slide 34

Slide 34 text

What I want to do with RBS Goose
Ruby → existing tools → RBS → guessing by LLM → Refined RBS
Untyped becomes a concrete type, and missing methods are filled in.
https://github.com/kokuyouwind/rbs_goose

Slide 35

Slide 35 text

I should say up front that it has not yet reached a practical level.

Slide 36

Slide 36 text

Agenda (repeated). Next section: How RBS Goose works

Slide 37

Slide 37 text

Example for explanation

lib/person.rb:

    class Person
      attr_reader :name

      def initialize(name:)
        @name = name
      end

      def name=(name)
        @name = name
      end
    end

sig/person.rbs:

    class Person
      @name: String

      attr_reader name: String

      def initialize: (name: String) -> void

      def name=: (String name) -> void
    end

Slide 38

Slide 38 text

RBS Goose Architecture (Infer Type)
Ruby → rbs prototype (or other tools) → RBS
Ruby + RBS + examples → Prompt → LLM (e.g. ChatGPT) → Refined RBS
RbsGoose::TypeInferrer#infer

Slide 39

Slide 39 text

RBS Goose Architecture (Infer Type)
Step: Ruby → rbs prototype (or other tools) → RBS

sig/person.rbs:

    class Person
      @name: untyped

      attr_reader name: untyped

      def initialize: (name: untyped) -> void

      def name=: (untyped name) -> void
    end

Slide 40

Slide 40 text

RBS Goose Architecture (Infer Type)
Step: examples (Ruby, prototype RBS, and refined RBS)

lib/example1.rb:

    class Example1
      attr_reader :quantity

      def initialize(quantity:)
        @quantity = quantity
      end

      def quantity=(quantity)
        @quantity = quantity
      end
    end

sig/example1.rbs:

    class Example1
      @quantity: untyped

      attr_reader quantity: untyped

      def initialize: (quantity: untyped) -> void

      def quantity=: (untyped quantity) -> void
    end

refined/sig/example1.rbs:

    class Example1
      @quantity: Integer

      attr_reader quantity: Integer

      def initialize: (quantity: Integer) -> void

      def quantity=: (Integer quantity) -> void
    end

Slide 41

Slide 41 text

RBS Goose Architecture (Infer Type)
Step: Ruby + RBS + examples → Prompt

    When ruby source codes and RBS type signatures are given, refine each RBS type signatures.

    ======== Input ========

    ```lib/example1.rb
    ...
    ```

    ```sig/example1.rbs
    ...
    ```

    ======== Output ========

    ```sig/example1.rbs
    ...
    ```

    ======== Input ========

    ```lib/person.rb
    ...
    ```

    ```sig/person.rbs
    ...
    ```

    ======== Output ========

Annotations on the slide: the first Input/Output pair is labeled "Examples", lib/person.rb is "Ruby Code", sig/person.rbs is "RBS Prototype", and the final open Output section is "LLM Infer".
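The prompt layout above can be sketched in a few lines of Ruby. This is only an illustration of the few-shot structure, not the actual RBS Goose implementation: `SourceFile`, `Example`, `render_file`, `render_section`, and `build_prompt` are hypothetical names made up for this sketch.

```ruby
FENCE = '`' * 3 # built at runtime so no literal fence appears in this example

# Hypothetical value objects for illustration only.
SourceFile = Struct.new(:path, :content)
Example = Struct.new(:inputs, :outputs)

INSTRUCTION = 'When ruby source codes and RBS type signatures are given, ' \
              'refine each RBS type signatures.'

# Render one file as a fenced block labeled with its path.
def render_file(file)
  "#{FENCE}#{file.path}\n#{file.content.chomp}\n#{FENCE}"
end

# Render a section header followed by its files.
def render_section(title, files)
  ["======== #{title} ========", *files.map { |f| render_file(f) }].join("\n\n")
end

# Examples become solved Input/Output pairs; the targets become the last
# Input, and the final Output section is left open for the LLM to complete.
def build_prompt(examples:, targets:)
  sections = examples.flat_map do |example|
    [render_section('Input', example.inputs), render_section('Output', example.outputs)]
  end
  [INSTRUCTION, *sections, render_section('Input', targets),
   render_section('Output', [])].join("\n\n")
end

example = Example.new(
  [SourceFile.new('lib/example1.rb', "class Example1\nend\n"),
   SourceFile.new('sig/example1.rbs', "class Example1\nend\n")],
  [SourceFile.new('sig/example1.rbs', "class Example1\nend\n")]
)
targets = [SourceFile.new('lib/person.rb', "class Person\nend\n"),
           SourceFile.new('sig/person.rbs', "class Person\nend\n")]

prompt = build_prompt(examples: [example], targets: targets)
puts prompt
```

Ending the prompt on an open Output header nudges the model to answer in the same format as the solved example.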

Slide 42

Slide 42 text

RBS Goose Architecture (Infer Type)
Step: LLM (e.g. ChatGPT) → Refined RBS
Diagram labels: Ruby, RBS, Refined RBS, steep prototype, examples, Prompt

    ```sig/person.rbs
    class Person
      @name: String

      attr_reader name: String

      def initialize: (name: String) -> void

      def name=: (String name) -> void
    end
    ```

Slide 43

Slide 43 text

RBS Goose Architecture (Infer Type)
Ruby → rbs prototype (or other tools) → RBS
Ruby + RBS + examples → Prompt → LLM (e.g. ChatGPT) → Refined RBS

Slide 44

Slide 44 text

Multiple File Handling

lib/person.rb:

    class Person
      attr_reader :name
    end

lib/person_name.rb:

    class PersonName
      attr_reader :value
    end

sig/person.rbs:

    class Person
      # Not a String
      @name: PersonName
    end

Slide 45

Slide 45 text

Multiple File Handling: Strategy
- Pass a list of all constants
- Use RAG to combine and pass files that may be related
- Infer all Ruby files at once

Slide 46

Slide 46 text

Multiple File Handling: Adopted
Infer all Ruby files at once
- The AI can make a comprehensive decision by looking at all of the code
- A small project fits in 128K tokens
- This was unrealistic as of last year, when 4K tokens was the maximum
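Whether a project fits in the context window can be checked roughly before calling the LLM. The sketch below assumes about 4 characters per token, which is only a rule of thumb (real tokenizers differ by model and language), and `budget_ratio` is a hypothetical parameter that leaves headroom for the examples, the prototype RBS, and the model's output.

```ruby
# Assumption: roughly 4 characters per token. Real tokenizers differ by model.
CHARS_PER_TOKEN = 4.0

# Estimate the token count of one source text with the heuristic above.
def estimate_tokens(text)
  (text.length / CHARS_PER_TOKEN).ceil
end

# True when all sources together stay within a share of the context window.
# budget_ratio is a hypothetical safety margin, not a value from RBS Goose.
def fits_in_context?(sources, context_limit: 128_000, budget_ratio: 0.5)
  total = sources.sum { |source| estimate_tokens(source) }
  total <= context_limit * budget_ratio
end

# Usage: read every Ruby file under lib/ and check the estimate.
sources = Dir.glob('lib/**/*.rb').map { |path| File.read(path) }
puts fits_in_context?(sources)
```

Under this heuristic, a few hundred lines of Ruby stay far below a 128K window, while a 4K window would be exhausted by a handful of files.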

Slide 47

Slide 47 text

Real Prompt

    Act as Ruby type inferrer.
    When ruby source codes and RBS type signatures are given, refine each RBS type signatures.
    Each file should be split in markdown code format.
    Use class names, variable names, etc., to infer type.

    ========Input========
    ```ruby:lib/email.rb
    class Email
      # @dynamic address
      attr_reader :address

      def initialize(address:)
        @address = address
      end

      def ==(other)
        other.is_a?(self.class) && other.address == address
      end

      def hash
        self.class.hash ^ address.hash
      end
    end
    ```

    ```rbs:sig/email.rbs
    class Email
      @address: untyped
      attr_reader address: untyped
      def initialize: (address: untyped) -> void
      def ==: (untyped other) -> untyped
      def hash: () -> untyped
    end
    ```

    ```ruby:lib/person.rb
    class Person
      # @dynamic name, contacts
      attr_reader :name
      attr_reader :contacts

      def initialize(name:)
        @name = name
        @contacts = []
      end

      def name=(name)
        @name = name
      end

      def guess_country()
        contacts.map do |contact|
          case contact
          when Phone
            contact.country
          end
        end.compact.first
      end
    end
    ```

    ```rbs:sig/person.rbs
    class Person
      @name: untyped
      @contacts: untyped
      attr_reader name: untyped
      attr_reader contacts: untyped
      def initialize: (name: untyped) -> void
      def name=: (untyped name) -> void
      def guess_country: () -> untyped
    end
    ```

    ```ruby:lib/phone.rb
    class Phone
      # @dynamic country, number
      attr_reader :country, :number

      def initialize(country:, number:)
        @country = country
        @number = number
      end

      def ==(other)
        if other.is_a?(Phone)
          # @type var other: Phone
          other.country == country && other.number == number
        else
          false
        end
      end

      def hash
        self.class.hash ^ country.hash ^ number.hash
      end
    end
    ```

    ```rbs:sig/phone.rbs
    class Phone
      @country: untyped
      @number: untyped
      attr_reader country: untyped
      attr_reader number: untyped
      def initialize: (country: untyped, number: untyped) -> void
      def ==: (untyped other) -> (untyped | nil)
      def hash: () -> untyped
    end
    ```

    ========Output========
    ```rbs:sig/email.rbs
    class Email
      @address: String
      attr_reader address: String
      def initialize: (address: String) -> void
      def ==: (Object other) -> bool
      def hash: () -> Integer
    end
    ```

    ```rbs:sig/person.rbs
    class Person
      @name: String
      @contacts: Array[(Email | Phone)]
      attr_reader name: String
      attr_reader contacts: Array[(Email | Phone)]
      def initialize: (name: String) -> void
      def name=: (String name) -> void
      def guess_country: () -> (String | nil)
    end
    ```

    ```rbs:sig/phone.rbs
    class Phone
      @country: String
      @number: String
      attr_reader country: String
      attr_reader number: String
      def initialize: (country: String, number: String) -> void
      def ==: (Object other) -> (bool | nil)
      def hash: () -> Integer
    end
    ```

    ========Input========
    ```ruby:lib/user.rb
    class User
      def initialize(name:)
        @name = name
      end

      attr_reader :name
    end
    ```

    ```rbs:sig/user.rbs
    class User
      @name: untyped
      def initialize: (name: untyped) -> void
      attr_reader name: untyped
    end
    ```

    ```ruby:lib/user_factory.rb
    class UserFactory
      def name(name)
        @name = name
        self
      end

      def build
        User.new(name: @name)
      end
    end
    ```

    ```rbs:sig/user_factory.rbs
    class UserFactory
      @name: untyped
      def name: (untyped name) -> self
      def build: () -> untyped
    end
    ```

    ========Output========
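Because the prompt asks for one labeled code block per file, the LLM response has to be split back into files afterwards. The following is a minimal sketch of that step, assuming the response uses the same `rbs:path` fence labels as the prompt; `parse_rbs_blocks` and `BLOCK_PATTERN` are hypothetical names, and the real RBS Goose parsing may differ.

```ruby
FENCE = '`' * 3 # built at runtime so no literal fence appears in this example

# Matches one fenced block labeled "rbs:<path>" and captures path and body.
BLOCK_PATTERN = /^#{FENCE}rbs:(\S+)\n(.*?)^#{FENCE}\s*$/m

# Returns a Hash of { "sig/path.rbs" => "rbs source" } found in the response.
def parse_rbs_blocks(response)
  response.scan(BLOCK_PATTERN).to_h
end

# A made-up response in the format the prompt asks for.
response = [
  "#{FENCE}rbs:sig/user.rbs",
  'class User',
  '  @name: String',
  'end',
  FENCE,
  '',
  "#{FENCE}rbs:sig/user_factory.rbs",
  'class UserFactory',
  '  def build: () -> User',
  'end',
  FENCE
].join("\n") + "\n"

files = parse_rbs_blocks(response)
files.each { |path, body| puts "#{path}: #{body.lines.size} lines" }
```

Text outside the labeled blocks is ignored, so stray commentary from the model does not end up in the generated signatures.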

Slide 48

Slide 48 text

Most of the time, one shot doesn't work. We need to look at the type errors that occur and correct them step by step.

Slide 49

Slide 49 text

RBS Goose Architecture (Fix Error) (still experimental)
RBS → steep check → ❌ Errors
Ruby + RBS + errors + examples → Prompt → LLM (e.g. ChatGPT) → Fixed RBS
RbsGoose::TypeInferrer#fix_error
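The check-and-fix cycle can be pictured as a bounded loop. The sketch below is an assumption about the overall shape, not the implementation of `RbsGoose::TypeInferrer#fix_error`: `refine_until_clean` is a hypothetical helper, and the `check` and `fix` lambdas are stand-ins for running `steep check` and for asking the LLM to repair the reported errors.

```ruby
# Repeatedly check the RBS and ask for a fix until no errors remain
# or the attempt budget is used up. Returns the last RBS text.
def refine_until_clean(rbs, check:, fix:, max_attempts: 3)
  max_attempts.times do
    errors = check.call(rbs)
    return rbs if errors.empty?

    rbs = fix.call(rbs, errors)
  end
  rbs
end

# Stand-in for `steep check`: reports an error while untyped remains.
check = ->(rbs) { rbs.include?('untyped') ? ['untyped remains'] : [] }
# Stand-in for the LLM call: replaces one untyped per attempt.
fix = ->(rbs, _errors) { rbs.sub('untyped', 'String') }

result = refine_until_clean("class Person\n  @name: untyped\nend\n", check: check, fix: fix)
puts result
```

Capping the number of attempts matters because each round costs an LLM call, and a model that keeps producing the same error would otherwise loop forever.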

Slide 50

Slide 50 text

Agenda (repeated). Next section: Tips for Development with LLM

Slide 51

Slide 51 text

RBS Goose Configuration
I want to allow users to choose which LLM to use.

Slide 52

Slide 52 text

Using an LLM framework as an adapter
Use the Langchain.rb gem by @andreibondarev.

Slide 53

Slide 53 text

RBS Goose Configuration Example

    api_key = ENV.fetch('OPENAI_ACCESS_TOKEN')

    RbsGoose.configure do |c|
      # Use the provided configuration methods
      c.use_open_ai(api_key)

      # or directly configure an instance of Langchain::LLM
      c.llm.client = ::Langchain::LLM::OpenAI.new(api_key:)

      # or Local Server such as Ollama
      c.llm.client = ::Langchain::LLM::Ollama.new(
        url: "http://localhost:11434"
      )
    end

ref: RbsGooseTest Rakefile Setups

Slide 54

Slide 54 text

RBS Goose Testing
- LLM APIs are expensive and have high latency, which makes them critically unsuitable for CI
- Web mocks such as the VCR gem can be used
  - Match requests exactly, including the request body
  - Set temperature to 0 for reproducibility

Slide 55

Slide 55 text

RBS Goose Testing - VCR Setup

    # spec/spec_helper.rb
    VCR.configure do |config|
      config.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
      config.hook_into :webmock
      config.default_cassette_options = {
        match_requests_on: %i[method uri body],
        record: ENV.fetch('RECORD', :once).to_sym
      }
      config.filter_sensitive_data('') {
        ENV.fetch('OPENAI_ACCESS_TOKEN')
      }
    end

ref: spec/spec_helper.rb

Slide 56

Slide 56 text

RBS Goose Testing - VCR Usage

    # spec/rbs_goose/type_inferrer_spec.rb
    RSpec.describe RbsGoose::TypeInferrer, :configure do
      it 'returns refined rbs' do
        VCR.use_cassette('openai/infer') do
          expect(described_class.new.infer).to eq(refined_rbs_list)
        end
      end
    end

ref: spec/rbs_goose/type_inferrer_spec.rb

Slide 57

Slide 57 text

Recorded Request Example

    ---
    http_interactions:
    - request:
        method: post
        uri: https://api.openai.com/v1/chat/completions
        body:
          encoding: UTF-8
          string: '{"messages":[
            {"role":"user","content":"Act as Ruby type inferrer..."}],
            "model":"gpt-3.5-turbo-1106","n":1,
            "temperature":0.0}'
        headers:
          Content-Type:
          - application/json
          Authorization:
          - Bearer ...
      response:
        ...

ref: spec/fixtures/vcr_cassettes/ollama_codegemma_chat/infer_user_factory.yml

Slide 58

Slide 58 text

Agenda (repeated). Next section: Performance Evaluation of RBS Goose

Slide 59

Slide 59 text

Evaluation 1: Config & Runner
kokuyouwind/rbs_goose_test case1/lib

lib/config.rb:

    class Config
      def self.configure(&block)
        new.tap(&block)
      end

      %w[client role prompt].each do
        attr_accessor _1.to_sym
      end
    end

lib/runner.rb:

    class Runner
      def initialize(config)
        @config = config
      end

      def run
        config.client.chat(
          messages: [{
            role: config.role,
            content: config.prompt
          }]
        ).chat_completion
      end

      private

      attr_reader :config
    end
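To see why this small example is hard for static tools, it helps to run it: the accessors only exist after the `each` block has executed. The sketch below repeats the two classes and drives them with a stub client. `FakeClient` and `FakeResponse` are hypothetical stand-ins for a Langchain.rb client and its response object, introduced only for this illustration.

```ruby
class Config
  def self.configure(&block)
    new.tap(&block)
  end

  # Accessors are defined at load time, so they are invisible to static parsing.
  %w[client role prompt].each do
    attr_accessor _1.to_sym
  end
end

class Runner
  def initialize(config)
    @config = config
  end

  def run
    config.client.chat(
      messages: [{ role: config.role, content: config.prompt }]
    ).chat_completion
  end

  private

  attr_reader :config
end

# Hypothetical stand-ins for an LLM client and its response object.
FakeResponse = Struct.new(:chat_completion)

class FakeClient
  def chat(messages:)
    FakeResponse.new("echo: #{messages.first[:content]}")
  end
end

config = Config.configure do |c|
  c.client = FakeClient.new
  c.role = 'user'
  c.prompt = 'Hello'
end

result = Runner.new(config).run
puts result
```

A good refinement therefore has to infer both the three accessors and the type of `client` from how `Runner#run` uses it.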

Slide 60

Slide 60 text

Evaluation 1: Config & Runner
kokuyouwind/rbs_goose_test case1
- Let RBS Goose guess a small example involving metaprogramming
- The base RBS is generated by each of the three methods explained earlier
- Tried OpenAI and Anthropic models, plus CodeGemma (local LLM)
- Checked with steep check, plus a quality check by reading the RBS
  - For example, whether any untyped remains that could be made concrete

Slide 61

Slide 61 text

Result 1: Generated RBS Quality

Platform        Model            Size    prototype rb base  prototype runtime base  TypeProf base
OpenAI          GPT-3.5 Turbo    Small   Perfect            Perfect                 Almost
OpenAI          GPT-4 Turbo      Large   Perfect            Perfect                 Perfect
OpenAI          GPT-4 Omni       Large   Perfect            Perfect                 Perfect
Anthropic       Claude 3 Haiku   Small   Perfect            Almost                  Almost
Anthropic       Claude 3 Sonnet  Medium  Almost             Not Good                Perfect
Anthropic       Claude 3 Opus    Large   Almost             Almost                  Almost
Ollama (local)  CodeGemma        Small   Not Good           Not Good                Not Good

Slide 62

Slide 62 text

Result 1: Attribute Accessors
rbs prototype runtime + OpenAI gpt-3.5-turbo
Regardless of the base RBS, the output was much the same.

Slide 63

Slide 63 text

Result 1: Almost Example
rbs prototype rb + Anthropic Claude 3 Sonnet
rbs prototype rb + Anthropic Claude 3 Opus
In some cases the model fabricated the return type of Langchain::LLM::OpenAI#chat.

Slide 64

Slide 64 text

Result 1: Interesting Case
rbs prototype runtime + OpenAI gpt-4-turbo
In one case only, gpt-4-turbo added a comment explaining why it had left a type as untyped.

Slide 65

Slide 65 text

Result 1: Execution Time [sec]

Platform        Model            Size    prototype rb base  prototype runtime base  TypeProf base
OpenAI          GPT-3.5 Turbo    Small   2.2                2.2                     4.8
OpenAI          GPT-4 Turbo      Large   7.6                11.4                    7.2
OpenAI          GPT-4 Omni       Large   1.8                1.7                     1.9
Anthropic       Claude 3 Haiku   Small   3.3                3.3                     2.9
Anthropic       Claude 3 Sonnet  Medium  3.5                8.6                     3.0
Anthropic       Claude 3 Opus    Large   14.6               13.0                    13.1
Ollama (local)  CodeGemma        Small   7.1                7.4                     4.1

Slide 66

Slide 66 text

Result 1: Time (Perfect or Almost only), quality with execution time [sec] in parentheses

Platform   Model            Size    prototype rb base  prototype runtime base  TypeProf base
OpenAI     GPT-3.5 Turbo    Small   Perfect (2.2)      Perfect (2.2)           Almost (4.8)
OpenAI     GPT-4 Turbo      Large   Perfect (7.6)      Perfect (11.4)          Perfect (7.2)
OpenAI     GPT-4 Omni       Large   Perfect (1.8)      Perfect (1.7)           Perfect (1.9)
Anthropic  Claude 3 Haiku   Small   Perfect (3.3)      Almost (3.3)            Almost (2.9)
Anthropic  Claude 3 Sonnet  Medium  Almost (3.5)       -                       Perfect (3.0)
Anthropic  Claude 3 Opus    Large   Almost (14.6)      Almost (7.4)            Almost (4.1)

Slide 67

Slide 67 text

Evaluation 1: Consideration
- The base RBS generation method made little difference
  - It looks fine to focus on rbs prototype rb, which is easy and fast to run
- For the model, the GPT family clearly performed better
  - GPT-4 Omni was the fastest and still produced ideal output
- The rbs prototype rb + GPT-4 Omni combination looks good

Slide 68

Slide 68 text

Evaluation 2: RbsGoose
- Infer RBS from the Ruby code of RbsGoose as a whole
- Only rbs prototype rb was used as the base

    # File Count
    ❯ find lib -type f | wc -l
    17

    # Line Count
    ❯ find lib -type f | xargs cat | wc -l
    698

    # Size Count
    ❯ du -sh lib
    68K lib

Slide 69

Slide 69 text

Result 2: Generated RBS Quality

Platform        Model            Size    Quality  Time [sec]  Cost [¢]
OpenAI          GPT-3.5 Turbo    Small   Poor     4.3         0.44
OpenAI          GPT-4 Turbo      Large   Almost   69.2        12.6
OpenAI          GPT-4 Omni       Large   Almost   52.5        7.86
Anthropic       Claude 3 Haiku   Small   Poor     33.4        0.65
Anthropic       Claude 3 Sonnet  Medium  Almost   55.5        7.88
Anthropic       Claude 3 Opus    Large   Almost   90.7        35.72
Ollama (local)  codegemma        Small   Subtle   95.9        N/A

Slide 70

Slide 70 text

Result 2: Almost Summary
Overall, the types are well guessed, including generics.

sig/rbs_goose/io/example_group.rbs:

    class RbsGoose::IO::ExampleGroup < ::Array[RbsGoose::IO::Example]
      self.@default_examples: Hash[Symbol, RbsGoose::IO::ExampleGroup]

      attr_accessor error_messages: String?

      def self.load_from: (String base_path, ?code_dir: String, ?sig_dir: String, ?refined_dir: String) -> RbsGoose::IO::ExampleGroup
      def self.default_examples: () -> Hash[Symbol, RbsGoose::IO::ExampleGroup]
      private def self.load_example: (String base_path, String code_dir, String path, String refined_dir, String sig_dir) -> RbsGoose::IO::Example
      private def self.to_rbs_path: (String path, String sig_dir) -> String

      def to_target_group: () -> RbsGoose::IO::TargetGroup
      def to_refined_rbs_list: () -> Array[RbsGoose::IO::File]
    end

ref: sig:refine result with OpenAI GPT-4 Omni

Slide 71

Slide 71 text

Evaluation 2: Failure Point
Failure description: syntax errors in Struct and def_delegator

sig/rbs_goose/configuration.rbs:

    class RbsGoose::Configuration
      LLMConfig: Struct[client: ::Langchain::LLM::Base, ...
      TemplateConfig: Struct[instruction: String, ...

      def_delegator llm, :client, :llm_client
      def_delegator llm, :mode, :llm_mode
      ...

Slide 72

Slide 72 text

Evaluation 2: What happens when I fix those?

Slide 73

Slide 73 text

🙃

Slide 74

Slide 74 text

Evaluation 2: Consideration
- RBS for special cases such as Struct cannot be handled well
  - It seems necessary to include them in the examples or to do fine-tuning
- Assuming a 1:1 correspondence between Ruby and RBS files was not a good idea
  - rbs_rails, typeprof, etc. generate RBS at the top level, so the files cannot be matched up
- I still want automatic fixing of type errors

Slide 75

Slide 75 text

Agenda (repeated). Next section: Conclusion

Slide 76

Slide 76 text

Conclusion
- Introduced a case study of creating RBS Goose
  - Explained how the prompt is composed and the intent behind it
  - Presented some tips for development with LLMs
- RBS Goose is still experimental
- I hope this conveys that LLMs can be used to do some interesting things

Slide 77

Slide 77 text

Concerns

Slide 78

Slide 78 text

Tomorrow's Session Schedule

Slide 79

Slide 79 text

rbs-inline https://github.com/soutaro/rbs-inline

Slide 80

Slide 80 text

Seems to work well with AI completion
- I tried GitHub Copilot completion and it completes quite well
- Editing entire projects with AI, as with Open Interpreter or Copilot Workspace, could also work well

Slide 81

Slide 81 text

Is RBS Goose dead?

Slide 82

Slide 82 text

I'm not sure yet whether RBS Goose will become a dead duck (end in failure) or a goose that lays golden eggs. So I'll keep at it a little longer before I cook my own goose (throw away my own chance of success).