Trying Out Email Classification with an LLM

ramo798
May 23, 2024

Transcript

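  1. What I did

    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Tokenize the email bodies with the Japanese BERT tokenizer
    tokenizer = AutoTokenizer.from_pretrained("cl-tohoku/bert-base-japanese-v3")
    encoding = tokenizer(
        df_email['Content'].tolist(),
        padding=True,
        truncation=True,
        return_tensors="pt",
        max_length=256,
    )

    # One output class per distinct question_id
    model = AutoModelForSequenceClassification.from_pretrained(
        "cl-tohoku/bert-base-japanese-v3",
        num_labels=len(df_email['question_id'].unique()),
    )

    # Preparing the training data, etc. (omitted)
    model.train()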
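The slide sets `num_labels` to the number of distinct `question_id` values, which means the classifier expects integer class indices from 0 to `num_labels - 1`. The slides do not show how the labels were prepared, so the following is only a minimal sketch of one common approach, assuming the raw `question_id` values are not already contiguous integers. The sample IDs and the helper `build_label_maps` are hypothetical and are not taken from the original deck.

```python
# Hypothetical question_id values; the real df_email['question_id']
# column is not shown in the slides.
question_ids = ["Q10", "Q3", "Q10", "Q7", "Q3"]


def build_label_maps(ids):
    """Map each distinct id to a contiguous class index.

    The distinct ids are sorted so the mapping is reproducible across runs.
    """
    unique_ids = sorted(set(ids))
    label2id = {qid: idx for idx, qid in enumerate(unique_ids)}
    id2label = {idx: qid for qid, idx in label2id.items()}
    return label2id, id2label


label2id, id2label = build_label_maps(question_ids)

# Integer targets in the range 0 .. num_labels - 1, one per email
labels = [label2id[qid] for qid in question_ids]

# Same quantity the slide computes with len(df_email['question_id'].unique())
num_labels = len(label2id)
```

Keeping `id2label` around makes it possible to turn a predicted class index back into the original `question_id` at inference time.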