Figure 1: Proxy-tuning “tunes” a large pretrained model without accessing its internal weights, by steering it using an “expert” (a small tuned model) and its corresponding “anti-expert” (the small model, untuned). The difference between the predicted logits of the expert and the anti-expert is applied as an offset on the original logits from the base model [...]

We want to fine-tune an LLM, but that is a lot of work, so we would like to skip it.
First, fine-tune a small LM. Record the difference between it and its original (untuned) model. Use that difference to correct the LLM's logits. The expectation is the same effect as fine-tuning the LLM itself.
Figure 1 (proxy-tuning), remainder of the caption, partly cut off: “[...] model, to guide it in the [...] scale. The logits show [...] LLAMA2-7B (from top [...]”

Point ①: the correction is needed at every time step.
Point ②: the method assumes the models share the same vocabulary.
Label on the figure: this (the large model) is the one whose fine-tuning we want to skip.
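The two points above can be made concrete with a small sketch. It is not the paper's implementation: the names `proxy_tuned_logits` and `greedy_decode` are hypothetical, each model is assumed to be a plain function mapping a token prefix to a logit vector, and greedy decoding is chosen only to keep the example short. The offset (expert minus anti-expert) is recomputed inside the loop at every step, and a shape check stands in for the shared-vocabulary assumption.

```python
import numpy as np


def proxy_tuned_logits(base_logits, expert_logits, anti_expert_logits):
    """Shift the large model's logits by (expert - anti-expert)."""
    base = np.asarray(base_logits, dtype=float)
    expert = np.asarray(expert_logits, dtype=float)
    anti = np.asarray(anti_expert_logits, dtype=float)
    # Point 2: the element-wise offset only makes sense if all three
    # models score the same vocabulary (here: same-length logit vectors).
    if not (base.shape == expert.shape == anti.shape):
        raise ValueError("all three models must share one vocabulary")
    return base + (expert - anti)


def greedy_decode(base_lm, expert_lm, anti_expert_lm, prompt, max_new_tokens):
    """Greedy decoding (an assumption of this sketch) with the offset applied.

    Each *_lm argument is a hypothetical callable: list of token ids ->
    logit vector for the next token.
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Point 1: the offset depends on the current prefix, so all three
        # models are queried again at every time step.
        logits = proxy_tuned_logits(
            base_lm(tokens), expert_lm(tokens), anti_expert_lm(tokens)
        )
        tokens.append(int(np.argmax(logits)))
    return tokens


# Toy models over a 4-token vocabulary; they ignore the prefix for brevity.
def toy_base(tokens):
    return [2.0, 1.0, 0.5, 0.0]    # large untuned model prefers token 0


def toy_anti_expert(tokens):
    return [1.0, 1.0, 1.0, 0.0]    # small untuned model


def toy_expert(tokens):
    return [1.0, 3.0, 1.0, 0.0]    # small tuned model prefers token 1


steered = proxy_tuned_logits(toy_base([]), toy_expert([]), toy_anti_expert([]))
generated = greedy_decode(toy_base, toy_expert, toy_anti_expert, [0], 3)
```

In this toy setup the small models differ only on token 1, so only that logit of the large model is shifted, which is enough to change its greedy choice from token 0 to token 1. The cost is three forward passes per generated token instead of one.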