Company Introduction: Sakana AI
Co-founders:
David Ha, CEO (formerly Head of Google Brain Tokyo)
Llion Jones, CTO (formerly at Google, co-author of the "Transformer" paper)
Ren Ito (伊藤錬), COO
Company Introduction: Sakana AI
Bigger models
Bigger data
More training time
There must be other things we should be doing
Company Introduction: Sakana AI
“The core research focus of Sakana AI is in applying nature-inspired ideas, such as evolution and collective intelligence, to improve foundation models’ performance”
• Similarly, there are several methods that merge weights element-wise
• Roughly speaking, for now it is fine to think of them as slightly souped-up variants of linear interpolation (see the sketch after the reference list below)
[2203.05482] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
[2212.04089] Editing Models with Task Arithmetic
[2306.01708] TIES-Merging: Resolving Interference When Merging Models
[2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
[2403.19522] Model Stock: All we need is just a few fine-tuned models
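To make the "souped-up linear interpolation" point concrete, here is a minimal sketch of element-wise weight merging in the model-soups style. The dict-of-arrays state dict and the parameter names are illustrative assumptions, not any particular library's API.

```python
import numpy as np

def lerp_merge(state_dicts, weights=None):
    """Element-wise weighted average of several models' parameters.

    state_dicts: list of {param_name: np.ndarray}, all with identical
                 keys and shapes (e.g. fine-tunes of the same base model).
    weights:     mixing coefficients; uniform averaging if None.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Toy usage: "merging" two 2-parameter models.
m1 = {"layer.w": np.array([1.0, 2.0]), "layer.b": np.array([0.0])}
m2 = {"layer.w": np.array([3.0, 4.0]), "layer.b": np.array([2.0])}
print(lerp_merge([m1, m2]))  # {'layer.w': array([2., 3.]), 'layer.b': array([1.])}
```

Methods like TIES-Merging and DARE (cited above) refine this basic recipe, e.g. by resolving sign conflicts or sparsifying the deltas before averaging, but the element-wise combination of aligned parameters is the common core.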
Approach 1
Weight-level model merging
What do people hope to gain by merging models?
② Integrating multiple capabilities (sketch after the references below)
[2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
[2310.04799] Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
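As a concrete illustration of ②, below is a hedged sketch of task arithmetic in the Chat Vector spirit: subtract the base model's weights from its instruction-tuned variant to obtain a "chat vector", then add that vector to a model continually pretrained on a new language. The single-tensor state dicts and variable names are toy assumptions; real use operates on full LLM checkpoints with matching architectures.

```python
import numpy as np

def task_vector(tuned, base):
    # Direction in weight space that encodes the tuned-in capability.
    return {k: tuned[k] - base[k] for k in base}

def apply_vector(model, vector, scale=1.0):
    # Graft the capability onto another model derived from the same base.
    return {k: model[k] + scale * vector[k] for k in model}

base_en = {"w": np.array([0.0, 0.0])}  # base pretrained model
chat_en = {"w": np.array([1.0, 0.5])}  # its instruction-tuned variant
base_ja = {"w": np.array([0.2, 0.1])}  # continually pretrained on a new language

chat_vec = task_vector(chat_en, base_en)    # the "chat ability" direction
chat_ja = apply_vector(base_ja, chat_vec)   # new-language model + chat ability
print(chat_ja)  # {'w': array([1.2, 0.6])}
```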
Methods
Based on MAP-Elites
Three key ideas
Difference #1
Alternating Quality (Q) and Behavior Characteristics (BCs) in each generation
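A minimal sketch of how "alternating Q and BCs" might look in a MAP-Elites-style loop: each generation, one task's score serves as the quality metric while the remaining tasks' scores define the behavior-characteristic bins, and the roles rotate. The task names, random stand-in metrics, and per-task archives are illustrative assumptions, not the authors' implementation.

```python
import random

TASKS = ["task_a", "task_b", "task_c"]

def evaluate(genome):
    # Stand-in scores in [0, 1] per task; a real system would run benchmarks.
    rnd = random.Random(genome)
    return {t: rnd.random() for t in TASKS}

def bc_cell(scores, bc_tasks, bins=5):
    # Discretize the BC tasks' scores into an archive cell index.
    return tuple(min(int(scores[t] * bins), bins - 1) for t in bc_tasks)

archives = {t: {} for t in TASKS}  # one archive per choice of quality task
population = [f"genome_{i}" for i in range(50)]

for gen in range(9):
    q_task = TASKS[gen % len(TASKS)]              # this generation's quality (Q)
    bc_tasks = [t for t in TASKS if t != q_task]  # the other tasks act as BCs
    for genome in population:
        scores = evaluate(genome)
        cell = bc_cell(scores, bc_tasks)
        best = archives[q_task].get(cell)
        if best is None or scores[q_task] > best[1][q_task]:
            archives[q_task][cell] = (genome, scores)  # keep the elite per cell

print({t: len(a) for t, a in archives.items()})
```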
Difference #2
Model merging as crossover
[Figure: Illustration of model merging]
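One way to read "model merging as crossover": instead of swapping discrete genes, a child model's parameters are an element-wise blend of two parents' weights with sampled mixing ratios. The per-tensor ratio below is an assumption for illustration; real merging recipes (e.g. the TIES- or DARE-style methods cited earlier) are more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

def merge_crossover(parent_a, parent_b):
    # Child = element-wise interpolation between two parents' parameters,
    # with an independently sampled mixing ratio per tensor.
    child = {}
    for name in parent_a:
        t = rng.uniform(0.0, 1.0)
        child[name] = (1.0 - t) * parent_a[name] + t * parent_b[name]
    return child

p1 = {"layer.w": np.ones(3)}
p2 = {"layer.w": np.zeros(3)}
print(merge_crossover(p1, p2))  # element-wise mix between the parents
```

A child produced this way can be evaluated and inserted back into the archive like any other candidate.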
Difference #3
SVD-based mutation
[Figure: Illustration of SVD-based mutation]
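And a hedged sketch of the SVD-based mutation idea: factorize a weight matrix, jitter its singular values, and reassemble. Perturbing only the spectrum, and the noise scale `sigma`, are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def svd_mutate(weight, sigma=0.05):
    # Decompose, multiplicatively perturb the singular values, recompose.
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    s_mut = s * (1.0 + sigma * rng.standard_normal(s.shape))
    return (u * s_mut) @ vt  # equivalent to u @ diag(s_mut) @ vt

w = rng.standard_normal((4, 3))
print(np.linalg.norm(svd_mutate(w) - w))  # small structured perturbation
```

Acting on singular values rather than raw entries keeps the mutation aligned with the matrix's existing low-rank structure, which is the intuition behind mutating in the SVD basis.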