
Let's use LLMs from Ruby 〜 Refine RBS types using LLM 〜

Slides from my talk at RubyKaigi 2024.
Japanese fonts get garbled in the PDF conversion, so the original below is easier to read.

Original: https://slides.com/kokuyouwind/lets-use-llms-from-ruby

kokuyouwind

May 15, 2024

Transcript

  1. Let's use LLMs from Ruby ~ Refine RBS types using LLM ~

    Kokuyou (黒曜) / Shunsuke Mori (@kokuyouwind), Leaner Technologies, Inc.
  2. $ whoami

    Kokuyou (黒曜) / Shunsuke Mori
    Twitter or X: @kokuyouwind
    Work at: Leaner Technologies, Inc., a BtoB startup in the procurement domain
    Platinum / Drinkup Sponsor (Day 2: Leaner YAKINIKU Party)
  3. This talk has nothing at all to do with my day job.
  4. Oh no, Matz presented on the same topic I was planning to cover!

    https://twitter.com/kokuyouwind/status/1656488513666453505
  5. Large Language Model (LLM)

    Machine learning models trained on large amounts of text data.
    Examples: OpenAI ChatGPT 4, Anthropic Claude 3, Microsoft GitHub Copilot
  6. Evolution of LLMs since last year

    💰 Price reduction: x 1/8
        $2.0 → $0.25 / 1MTok (OpenAI gpt-3.5-turbo-0301 to Anthropic Claude3 Haiku)
    📚 Expansion of the text length that can be handled: x 480
        4096 → 2M tokens (OpenAI gpt-3.5-turbo-0301 to Google Gemini 1.5 Pro 2M)
    📝 Improvement of coding skills: + 23.2 pt
        HumanEval 67.0% → 90.2% (GPT-4 to GPT-4o)
  7. 🤔 Now that LLMs have become so much more capable, could we even guess the RBS types for an entire project?
  9. 😲 Now that LLM's capabilities have increased so much, can

    we even guess RBS types for the entire project? Goose
  10. This talk is about how I built RBS Goose, a tool that guesses RBS types from Ruby code.
  11. Agenda

    Purpose of creating RBS Goose
    How RBS Goose works
    Tips for Development with LLM
    Performance Evaluation of RBS Goose
    Conclusion
  13. Because I thought it would be interesting if an LLM could guess the types.
  14. What is RBS?

    A language for defining the type structure of Ruby programs.
    https://github.com/ruby/rbs

    person.rb

        class Person
          attr_reader :name

          def initialize(name:)
            @name = name
          end

          def name=(name)
            @name = name
          end
        end

    person.rbs

        class Person
          @name: String

          attr_reader name: String

          def initialize: (name: String) -> void

          def name=: (String name) -> void
        end
  15. Why RBS is needed

    Safe development through type checking
        Invalid method calls and similar mistakes are detected before execution
        Tools such as steep check can be used
    Improved development experience, including completion
        More accurate completion based on types
        steep-vscode, TypeProf for IDE, and similar tools can be used
  16. Existing ways to generate RBS from Ruby

    rbs prototype rb: Ruby → RBS by static parsing
    rbs prototype runtime: Ruby → RBS by dynamic load
    typeprof: Ruby → RBS by type level execution
  17. Each method has its own strengths and weaknesses, and it's difficult to generate a perfect RBS in one shot.
  18. rbs prototype rb config.rb (Ruby → RBS by static parsing)

    config.rb

        class Config
          def self.configure(&block)
            new.tap(&block)
          end

          %w[hoge fuga piyo].each do |v|
            attr_accessor v
          end
        end

        # config = Config.configure do |c|
        #   c.hoge = 1
        #   c.fuga = 'a'
        #   c.piyo = :piyo
        # end

    config.rbs

        class Config
          def self.configure: () { () -> untyped } -> untyped
        end

        # All untyped, and
        # No attribute accessors
  19. rbs prototype runtime -R config.rb Config (Ruby → RBS by dynamic load)

    config.rb

        class Config
          def self.configure(&block)
            new.tap(&block)
          end

          %w[hoge fuga piyo].each do |v|
            attr_accessor v
          end
        end

        # config = Config.configure do |c|
        #   c.hoge = 1
        #   c.fuga = 'a'
        #   c.piyo = :piyo
        # end

    config.rbs

        class Config
          def self.configure: () { (*untyped) -> untyped } -> untyped

          public

          def fuga: () -> untyped
          def fuga=: (untyped) -> untyped
          def hoge: () -> untyped
          def hoge=: (untyped) -> untyped
          def piyo: () -> untyped
          def piyo=: (untyped) -> untyped
        end

        # All Untyped
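Why dynamic loading finds the accessors that static parsing misses can be illustrated with plain Ruby reflection. This is only a sketch of the idea, not how `rbs prototype runtime` is actually implemented: once the class body has run, the methods defined inside the `each` loop exist as ordinary methods, but reflection exposes only their names and arity, not their types.

```ruby
# Same Config class as on the slide.
class Config
  def self.configure(&block)
    new.tap(&block)
  end

  %w[hoge fuga piyo].each do |v|
    attr_accessor v
  end
end

# After loading, the dynamically defined accessors are visible.
p Config.instance_methods(false).sort
# => [:fuga, :fuga=, :hoge, :hoge=, :piyo, :piyo=]

# Reflection gives names and arity, but nothing about types,
# which is consistent with every signature coming out as untyped.
p Config.instance_method(:hoge=).arity
# => 1
```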
  20. typeprof config.rb (Ruby → RBS by type level execution)

    config.rb

        class Config
          def self.configure(&block)
            new.tap(&block)
          end

          %w[hoge fuga piyo].each do |v|
            attr_accessor v
          end
        end

        # Required for Type Level Exec
        config = Config.configure do |c|
          c.hoge = 1
          c.fuga = 'a'
          c.piyo = :piyo
        end

    config.rbs

        class Config
          def self.configure: { (Config) -> :piyo }
        end

        # Typed, but
        # No attribute accessors
        # (and I want the return value to be void.)
  21. It's pretty hard to fix untyped or make up for what's missing.
  22. What I want to do with RBS Goose

    Ruby → (existing tools) → RBS → (guessing by LLM) → Refined RBS
    untyped becomes a concrete type, and the missing methods are filled in.
    https://github.com/kokuyouwind/rbs_goose
  23. I should say up front that it has not yet reached a practical level.
  25. Example for explanation

    lib/person.rb

        class Person
          attr_reader :name

          def initialize(name:)
            @name = name
          end

          def name=(name)
            @name = name
          end
        end

    sig/person.rbs

        class Person
          @name: String

          attr_reader name: String

          def initialize: (name: String) -> void

          def name=: (String name) -> void
        end
  26. RBS Goose Architecture (Infer Type)

    Ruby → rbs prototype (or other tools) → RBS
    Ruby + RBS + examples → Prompt → LLM (e.g. ChatGPT) → Refined RBS
    Entry point: RbsGoose::TypeInferrer#infer
  27. RBS Goose Architecture (Infer Type): RBS generated by rbs prototype (or other tools)

    sig/person.rbs

        class Person
          @name: untyped

          attr_reader name: untyped

          def initialize: (name: untyped) -> void

          def name=: (untyped name) -> void
        end
  28. RBS Goose Architecture (Infer Type): examples

    lib/example1.rb

        class Example1
          attr_reader :quantity

          def initialize(quantity:)
            @quantity = quantity
          end

          def quantity=(quantity)
            @quantity = quantity
          end
        end

    sig/example1.rbs

        class Example1
          @quantity: untyped

          attr_reader quantity: untyped

          def initialize: (quantity: untyped) -> void

          def quantity=: (untyped quantity) -> void
        end

    refined/sig/example1.rbs

        class Example1
          @quantity: Integer

          attr_reader quantity: Integer

          def initialize: (quantity: Integer) -> void

          def quantity=: (Integer quantity) -> void
        end
  29. RBS Goose Architecture (Infer Type): Prompt

    The first Input/Output pair comes from the examples, the second Input holds the target Ruby code and its RBS prototype, and the LLM infers what follows the final Output marker.

        When ruby source codes and RBS type signatures are given, refine each RBS type signatures.

        ======== Input ========

        ```lib/example1.rb
        ...
        ```

        ```sig/example1.rbs
        ...
        ```

        ======== Output ========

        ```sig/example1.rbs
        ...
        ```

        ======== Input ========

        ```lib/person.rb
        ...
        ```

        ```sig/person.rbs
        ...
        ```

        ======== Output ========
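The prompt layout on this slide can be assembled mechanically. The following is a minimal sketch under stated assumptions, not RBS Goose's actual implementation: `SourcePair`, `fenced`, `input_section`, and `build_prompt` are hypothetical names introduced here for illustration, and the file contents are shortened.

```ruby
# Hypothetical value object: a Ruby file, its prototype RBS, and
# (for examples only) the hand-refined RBS.
SourcePair = Struct.new(:ruby_path, :ruby, :rbs_path, :rbs, :refined_rbs,
                        keyword_init: true)

INSTRUCTION = 'When ruby source codes and RBS type signatures are given, ' \
              'refine each RBS type signatures.'
FENCE = '`' * 3 # three backticks, built this way to keep the listing readable

# Wrap a file body in a code fence labelled with its path.
def fenced(path, body)
  "#{FENCE}#{path}\n#{body.chomp}\n#{FENCE}"
end

# One "Input" section: the Ruby file followed by its prototype RBS.
def input_section(pair)
  ['======== Input ========',
   fenced(pair.ruby_path, pair.ruby),
   fenced(pair.rbs_path, pair.rbs)]
end

# Few-shot prompt: instruction, solved examples, then the unsolved target.
def build_prompt(examples:, target:)
  parts = [INSTRUCTION]
  examples.each do |example|
    parts.concat(input_section(example))
    parts << '======== Output ========'
    parts << fenced(example.rbs_path, example.refined_rbs)
  end
  parts.concat(input_section(target))
  parts << '======== Output ========' # the LLM continues from here
  parts.join("\n")
end

example = SourcePair.new(
  ruby_path: 'lib/example1.rb', ruby: "class Example1\n  attr_reader :quantity\nend",
  rbs_path: 'sig/example1.rbs', rbs: "class Example1\n  attr_reader quantity: untyped\nend",
  refined_rbs: "class Example1\n  attr_reader quantity: Integer\nend"
)
target = SourcePair.new(
  ruby_path: 'lib/person.rb', ruby: "class Person\n  attr_reader :name\nend",
  rbs_path: 'sig/person.rbs', rbs: "class Person\n  attr_reader name: untyped\nend"
)

puts build_prompt(examples: [example], target: target)
```

Ending the prompt on the bare Output marker is what turns the examples into a pattern for the model to complete.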
  30. RBS Goose Architecture (Infer Type): Refined RBS returned by the LLM (e.g. ChatGPT)

        ```sig/person.rbs
        class Person
          @name: String

          attr_reader name: String

          def initialize: (name: String) -> void

          def name=: (String name) -> void
        end
        ```
  31. RBS Goose Architecture (Infer Type)

    Ruby → rbs prototype (or other tools) → RBS
    Ruby + RBS + examples → Prompt → LLM (e.g. ChatGPT) → Refined RBS
  32. Multiple File Handling

    lib/person.rb

        class Person
          attr_reader :name
        end

    lib/person_name.rb

        class PersonName
          attr_reader :value
        end

    sig/person.rbs

        class Person
          # Not a String
          @name: PersonName
        end
  33. Multiple File Handling: Strategy

    Pass a list of all constants
    Use RAG to combine and pass files that may be related
    Infer all Ruby files at once
  34. Multiple File Handling: Adopted

    Infer all Ruby files at once
        The AI can see all the code and make a comprehensive decision
        A small project fits in 128K tokens
        Unrealistic as of last year, when 4K tokens was the max
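Whether a project fits in the context window can be estimated roughly before calling any API. The sketch below is an assumption-laden illustration, not part of RBS Goose: it uses a crude four-characters-per-token average instead of a real tokenizer, and `estimate_tokens` and `fits_in_context?` are hypothetical helper names.

```ruby
CHARS_PER_TOKEN = 4.0     # assumption: crude average, not a real tokenizer
CONTEXT_LIMIT   = 128_000 # token limit mentioned on the slide

# Very rough token estimate for a piece of text.
def estimate_tokens(text)
  (text.length / CHARS_PER_TOKEN).ceil
end

# True if the files are estimated to fit, keeping part of the window
# free for the examples, the RBS prototypes, and the response.
def fits_in_context?(paths, limit: CONTEXT_LIMIT, reserve: 0.5)
  total = paths.sum { |path| estimate_tokens(File.read(path)) }
  total <= limit * (1 - reserve)
end

ruby_files = Dir.glob('lib/**/*.rb')
puts "#{ruby_files.size} files, fits: #{fits_in_context?(ruby_files)}"
```

A real check would use the tokenizer of the chosen model, since token counts for source code can differ noticeably from this average.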
  35. Real Prompt

        Act as Ruby type inferrer.
        When ruby source codes and RBS type signatures are given, refine each RBS type signatures.
        Each file should be split in markdown code format.
        Use class names, variable names, etc., to infer type.

        ========Input========
        ```ruby:lib/email.rb
        class Email
          # @dynamic address
          attr_reader :address

          def initialize(address:)
            @address = address
          end

          def ==(other)
            other.is_a?(self.class) && other.address == address
          end

          def hash
            self.class.hash ^ address.hash
          end
        end
        ```

        ```rbs:sig/email.rbs
        class Email
          @address: untyped

          attr_reader address: untyped

          def initialize: (address: untyped) -> void

          def ==: (untyped other) -> untyped

          def hash: () -> untyped
        end
        ```

        ```ruby:lib/person.rb
        class Person
          # @dynamic name, contacts
          attr_reader :name
          attr_reader :contacts

          def initialize(name:)
            @name = name
            @contacts = []
          end

          def name=(name)
            @name = name
          end

          def guess_country()
            contacts.map do |contact|
              case contact
              when Phone
                contact.country
              end
            end.compact.first
          end
        end
        ```

        ```rbs:sig/person.rbs
        class Person
          @name: untyped
          @contacts: untyped

          attr_reader name: untyped
          attr_reader contacts: untyped

          def initialize: (name: untyped) -> void

          def name=: (untyped name) -> void

          def guess_country: () -> untyped
        end
        ```

        ```ruby:lib/phone.rb
        class Phone
          # @dynamic country, number
          attr_reader :country, :number

          def initialize(country:, number:)
            @country = country
            @number = number
          end

          def ==(other)
            if other.is_a?(Phone)
              # @type var other: Phone
              other.country == country && other.number == number
            else
              false
            end
          end

          def hash
            self.class.hash ^ country.hash ^ number.hash
          end
        end
        ```

        ```rbs:sig/phone.rbs
        class Phone
          @country: untyped
          @number: untyped

          attr_reader country: untyped
          attr_reader number: untyped

          def initialize: (country: untyped, number: untyped) -> void

          def ==: (untyped other) -> (untyped | nil)

          def hash: () -> untyped
        end
        ```

        ========Output========
        ```rbs:sig/email.rbs
        class Email
          @address: String

          attr_reader address: String

          def initialize: (address: String) -> void

          def ==: (Object other) -> bool

          def hash: () -> Integer
        end
        ```

        ```rbs:sig/person.rbs
        class Person
          @name: String
          @contacts: Array[(Email | Phone)]

          attr_reader name: String
          attr_reader contacts: Array[(Email | Phone)]

          def initialize: (name: String) -> void

          def name=: (String name) -> void

          def guess_country: () -> (String | nil)
        end
        ```

        ```rbs:sig/phone.rbs
        class Phone
          @country: String
          @number: String

          attr_reader country: String
          attr_reader number: String

          def initialize: (country: String, number: String) -> void

          def ==: (Object other) -> (bool | nil)

          def hash: () -> Integer
        end
        ```

        ========Input========
        ```ruby:lib/user.rb
        class User
          def initialize(name:)
            @name = name
          end

          attr_reader :name
        end
        ```

        ```rbs:sig/user.rbs
        class User
          @name: untyped

          def initialize: (name: untyped) -> void

          attr_reader name: untyped
        end
        ```

        ```ruby:lib/user_factory.rb
        class UserFactory
          def name(name)
            @name = name
            self
          end

          def build
            User.new(name: @name)
          end
        end
        ```

        ```rbs:sig/user_factory.rbs
        class UserFactory
          @name: untyped

          def name: (untyped name) -> self

          def build: () -> untyped
        end
        ```

        ========Output========
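Because the prompt asks for each file in its own fenced block labelled `rbs:<path>`, the response can be split back into files with a small parser. This is a sketch of one way to do it, assuming the model follows the format exactly; `parse_refined_rbs` and `BLOCK_PATTERN` are hypothetical names, not RBS Goose's actual code.

```ruby
CODE_FENCE = '`' * 3 # three backticks

# Matches one fenced block whose info string is "rbs:<path>" (or just "<path>").
BLOCK_PATTERN = /^#{CODE_FENCE}(?:rbs:)?(?<path>\S+)\n(?<body>.*?)^#{CODE_FENCE}\s*$/m

# Returns a Hash of { "sig/xxx.rbs" => "RBS source" }.
def parse_refined_rbs(response)
  response.scan(BLOCK_PATTERN).to_h { |path, body| [path, body] }
end

# Hand-written stand-in for an LLM response.
response = [
  "#{CODE_FENCE}rbs:sig/user.rbs",
  'class User',
  '  @name: String',
  'end',
  CODE_FENCE,
  '',
  "#{CODE_FENCE}rbs:sig/user_factory.rbs",
  'class UserFactory',
  '  def build: () -> User',
  'end',
  CODE_FENCE
].join("\n")

parse_refined_rbs(response).each do |path, body|
  puts "== #{path}"
  puts body
end
```

In practice the model may add prose or omit a file, so a real implementation would also need to handle missing or unexpected paths.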
  36. Most of the time, one shot doesn't work. We need to look at the type errors that occur and correct them step by step.
  37. RBS Goose Architecture (Fix Error) (still experimental)

    Ruby + RBS → Steep Check → ❌ Errors
    Ruby + RBS + Errors + examples → Prompt → LLM (e.g. ChatGPT) → Fixed RBS
    Entry point: RbsGoose::TypeInferrer#fix_error
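The check-then-fix cycle can be sketched as a bounded loop. This is an illustration of the general idea only, not the implementation of `RbsGoose::TypeInferrer#fix_error`: `refine_until_clean` is a hypothetical helper, and the checker and fixer are passed in as callables so that toy stand-ins can replace `steep check` and the LLM call.

```ruby
# checker: callable taking an RBS string, returning an array of error messages
#          (in practice this might wrap `steep check`; a toy lambda here).
# fixer:   callable taking (rbs, errors), returning a corrected RBS string
#          (in practice an LLM call; a toy lambda here).
def refine_until_clean(rbs, checker:, fixer:, max_attempts: 3)
  max_attempts.times do
    errors = checker.call(rbs)
    return { rbs: rbs, errors: [] } if errors.empty?

    rbs = fixer.call(rbs, errors)
  end
  # Give up after max_attempts and report whatever errors remain.
  { rbs: rbs, errors: checker.call(rbs) }
end

# Toy stand-ins so the loop can run without Steep or an API key.
toy_checker = ->(rbs) { rbs.include?('untyped') ? ['untyped remains'] : [] }
toy_fixer   = ->(rbs, _errors) { rbs.sub('untyped', 'String') }

result = refine_until_clean("class User\n  @name: untyped\nend\n",
                            checker: toy_checker, fixer: toy_fixer)
puts result[:rbs]
puts "remaining errors: #{result[:errors].size}"
```

Capping the number of attempts matters because each round costs an API call and the model is not guaranteed to converge.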
  39. RBS Goose Configuration

    I want to allow users to choose which LLM to use.
  40. Using an LLM framework as an adapter

    RBS Goose uses the Langchain.rb gem by @andreibondarev.
  41. RBS Goose Configuration Example

        api_key = ENV.fetch('OPENAI_ACCESS_TOKEN')

        RbsGoose.configure do |c|
          # Use the provided configuration methods
          c.use_open_ai(api_key)

          # or directly configure an instance of Langchain::LLM
          c.llm.client = ::Langchain::LLM::OpenAI.new(api_key:)

          # or Local Server such as Ollama
          c.llm.client = ::Langchain::LLM::Ollama.new(
            url: "http://localhost:11434"
          )
        end

    ref: RbsGooseTest Rakefile Setups
  42. RBS Goose Testing

    LLM APIs are expensive and have high latency
        Critically unsuitable for CI
    Web mocks such as the VCR gem can be used
        Match requests exactly, including the request body
        Set temperature to 0 for reproducibility
  43. RBS Goose Testing - VCR Setup

        # spec/spec_helper.rb
        VCR.configure do |config|
          config.cassette_library_dir = 'spec/fixtures/vcr_cassettes'
          config.hook_into :webmock
          config.default_cassette_options = {
            match_requests_on: %i[method uri body],
            record: ENV.fetch('RECORD', :once).to_sym
          }
          config.filter_sensitive_data('<openai_access_token>') {
            ENV.fetch('OPENAI_ACCESS_TOKEN')
          }
        end

    ref: spec/spec_helper.rb
  44. RBS Goose Testing - VCR Usage

        # spec/rbs_goose/type_inferrer_spec.rb
        RSpec.describe RbsGoose::TypeInferrer, :configure do
          it 'returns refined rbs' do
            VCR.use_cassette('openai/infer') do
              expect(described_class.new.infer).to eq(refined_rbs_list)
            end
          end
        end

    ref: spec/rbs_goose/type_inferrer_spec.rb
  45. Recorded Request Example

        ---
        http_interactions:
        - request:
            method: post
            uri: https://api.openai.com/v1/chat/completions
            body:
              encoding: UTF-8
              string: '{"messages":[
                {"role":"user","content":"Act as Ruby type inferrer..."}],
                "model":"gpt-3.5-turbo-1106","n":1,
                "temperature":0.0}'
            headers:
              Content-Type:
              - application/json
              Authorization:
              - Bearer <openai_access_token>
              ...
          response:
            ...

    ref: spec/fixtures/vcr_cassettes/ollama_codegemma_chat/infer_user_factory.yml
  47. Evaluation 1: Config & Runner

    kokuyouwind/rbs_goose_test case1/lib

    lib/config.rb

        class Config
          def self.configure(&block)
            new.tap(&block)
          end

          %w[client role prompt].each do
            attr_accessor _1.to_sym
          end
        end

    lib/runner.rb

        class Runner
          def initialize(config)
            @config = config
          end

          def run
            config.client.chat(
              messages: [{
                role: config.role,
                content: config.prompt
              }]
            ).chat_completion
          end

          private

          attr_reader :config
        end
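To see how the two evaluation classes fit together at runtime, they can be exercised with a stand-in client. This is only a sketch: `FakeClient` and `FakeResponse` are hypothetical test doubles invented here so that no API call is made; the real evaluation uses an actual LLM client.

```ruby
# Config and Runner as on the slide.
class Config
  def self.configure(&block)
    new.tap(&block)
  end

  %w[client role prompt].each do
    attr_accessor _1.to_sym
  end
end

class Runner
  def initialize(config)
    @config = config
  end

  def run
    config.client.chat(
      messages: [{ role: config.role, content: config.prompt }]
    ).chat_completion
  end

  private

  attr_reader :config
end

# Hypothetical stand-ins: anything responding to chat(messages:) with an
# object that responds to chat_completion will do.
FakeResponse = Struct.new(:chat_completion)

class FakeClient
  def chat(messages:)
    FakeResponse.new("echo: #{messages.first[:content]}")
  end
end

config = Config.configure do |c|
  c.client = FakeClient.new
  c.role = 'user'
  c.prompt = 'Hello'
end

puts Runner.new(config).run
```

Nothing in the source states which types `client`, `role`, and `prompt` hold, which is what makes this small example a useful test of type guessing.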
  48. Evaluation 1: Config & Runner

    Had RBS Goose guess types for a small example involving metaprogramming (kokuyouwind/rbs_goose_test case1)
    The base RBS was generated by each of the three methods described earlier
    Tried OpenAI and Anthropic models, plus CodeGemma (local LLM)
    Ran steep check, plus a quality check by reading the RBS
        For example, checking whether any untyped remained that could be made more concrete
  49. Result 1: Generated RBS Quality

    Rated for each model with three bases: prototype rb base, prototype runtime base, Typeprof base.

        Platform       Model            Size
        OpenAI         GPT-3.5 Turbo    Small
        OpenAI         GPT-4 Turbo      Large
        OpenAI         GPT-4 Omni       Large
        Anthropic      Claude 3 Haiku   Small
        Anthropic      Claude 3 Sonnet  Medium
        Anthropic      Claude 3 Opus    Large
        Ollama(Local)  CodeGemma        Small

    Ratings in transcript order (cell positions are not preserved here; see the original slides): Perfect Perfect Almost Perfect Perfect Perfect Perfect Perfect Perfect Almost Almost Almost Not Good Perfect Almost Almost Almost Not Good Not Good Not Good
  50. Result 1: Attribute Accessors

    rbs prototype runtime + OpenAI gpt-3.5-turbo
    Regardless of the base RBS, the output was much the same.
  51. Result 1: Almost Example

    rbs prototype rb + Anthropic Claude 3 Sonnet
    rbs prototype rb + Anthropic Claude 3 Opus
    In some cases the model fabricated the return type of LangChain::LLM::OpenAI#chat.
  52. Result 1: Interesting Case

    rbs prototype runtime + OpenAI gpt-4-turbo
    In just one case, gpt-4-turbo left a comment explaining why a type was left as untyped.
  53. Result 1: Execution Time [sec]

        Platform       Model            Size    prototype rb base  prototype runtime base  Typeprof base
        OpenAI         GPT-3.5 Turbo    Small    2.2                 2.2                    4.8
        OpenAI         GPT-4 Turbo      Large    7.6                11.4                    7.2
        OpenAI         GPT-4 Omni       Large    1.8                 1.7                    1.9
        Anthropic      Claude 3 Haiku   Small    3.3                 3.3                    2.9
        Anthropic      Claude 3 Sonnet  Medium   3.5                 8.6                    3.0
        Anthropic      Claude 3 Opus    Large   14.6                13.0                   13.1
        Ollama(Local)  CodeGemma        Small    7.1                 7.4                    4.1
  54. Result 1: Time (Perfect or Almost)

    Execution time [sec], shown only for combinations rated Perfect or Almost. Cell placement follows the matching values in the execution time table on slide 53.

        Platform       Model            Size    prototype rb base  prototype runtime base  Typeprof base
        OpenAI         GPT-3.5 Turbo    Small    (2.2)               (2.2)                  (4.8)
        OpenAI         GPT-4 Turbo      Large    (7.6)              (11.4)                  (7.2)
        OpenAI         GPT-4 Omni       Large    (1.8)               (1.7)                  (1.9)
        Anthropic      Claude 3 Haiku   Small    (3.3)               (3.3)                  (2.9)
        Anthropic      Claude 3 Sonnet  Medium   (3.5)               -                      (3.0)
        Anthropic      Claude 3 Opus    Large   (14.6)               -                      -
        Ollama(Local)  CodeGemma        Small    -                   (7.4)                  (4.1)

    Ratings in transcript order (cell positions are not preserved here; see the original slides): Perfect Perfect Almost Perfect Perfect Perfect Perfect Perfect Perfect Perfect Almost Almost Almost Perfect Almost Almost Almost
  55. Evaluation 1: Discussion

    The method used to generate the base RBS made little difference
        It looks fine to focus on rbs prototype rb, which is easy and fast to run
    For the model, the GPT family clearly performed better
        GPT-4 Omni was the fastest and still produced ideal output
    The combination of rbs prototype rb + GPT-4 Omni looks good
  56. Evaluation 2: RbsGoose

    Infer RBS for the entire Ruby codebase of RbsGoose itself
    Only rbs prototype rb was used for the base RBS

        # File Count
        ❯ find lib -type f | wc -l
        17

        # Line Count
        ❯ find lib -type f | xargs cat | wc -l
        698

        # Size Count
        ❯ du -sh lib
        68K lib
  57. Result 2: Generated RBS Quality

        Platform       Model            Size    Quality  time[sec]  cost[¢]
        OpenAI         GPT-3.5 Turbo    Small   Poor       4.3       0.44
        OpenAI         GPT-4 Turbo      Large   Almost    69.2      12.6
        OpenAI         GPT-4 Omni       Large   Almost    52.5       7.86
        Anthropic      Claude 3 Haiku   Small   Poor      33.4       0.65
        Anthropic      Claude 3 Sonnet  Medium  Almost    55.5       7.88
        Anthropic      Claude 3 Opus    Large   Almost    90.7      35.72
        Ollama(Local)  codegemma        Small   Subtle    95.9       N/A
  58. Result 2: Almost Summary

    Overall, the types are guessed well, including generics.

    sig/rbs_goose/io/example_group.rbs

        class RbsGoose::IO::ExampleGroup < ::Array[RbsGoose::IO::Example]
          self.@default_examples: Hash[Symbol, RbsGoose::IO::ExampleGroup]

          attr_accessor error_messages: String?

          def self.load_from: (String base_path, ?code_dir: String, ?sig_dir: String, ?refined_dir: String) -> RbsGoose::IO::ExampleGroup
          def self.default_examples: () -> Hash[Symbol, RbsGoose::IO::ExampleGroup]

          private def self.load_example: (String base_path, String code_dir, String path, String refined_dir, String sig_dir) -> RbsGoose::IO::Example
          private def self.to_rbs_path: (String path, String sig_dir) -> String

          def to_target_group: () -> RbsGoose::IO::TargetGroup
          def to_refined_rbs_list: () -> Array[RbsGoose::IO::File]
        end

    ref: sig:refine result with OpenAI GPT-4-omni
  59. Evaluation 2: Failure Point

    Failure description: syntax errors in Struct and def_delegator

    sig/rbs_goose/configuration.rbs

        class RbsGoose::Configuration
          LLMConfig: Struct[client: ::Langchain::LLM::Base, ...
          TemplateConfig: Struct[instruction: String, ...

          def_delegator llm, :client, :llm_client
          def_delegator llm, :mode, :llm_mode
          ...
  60. Evaluation 2: What happens when those are fixed
  61. Evaluation 2: Discussion

    RBS for special cases such as Struct is not handled well
        It seems necessary to include such cases in the examples, or to do fine-tuning
    Assuming a one-to-one mapping between Ruby and RBS files was not a good choice
        rbs_rails, typeprof, etc. generate RBS at the top level, so the files cannot be paired up
    I still want automatic fixing of type errors
  63. Conclusion

    Introduced a case study of building RBS Goose
    Explained how the prompt is composed and the intent behind it
    Presented some tips for development with LLMs
    RBS Goose is still experimental
    I hope this conveys that LLMs can be used to do some interesting things
  64. Seems to work well with AI completion

    Completion: I tried GitHub Copilot, and it completes quite well.
    Editing entire projects with AI, as with Open Interpreter or Copilot Workspace, could also work well.
  65. I'm not sure yet whether RBS Goose will become a dead duck (end in failure) or a goose that lays golden eggs. So I'll keep at it a little longer before I cook my own goose (throw away my own chance of success).