Slide 41
# Save as t.py
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("We are very happy to show you the 🤗 Transformers library.")
print(encoded)
Output
Downloading tokenizer_config.json: 100%|█| 48.0/48.0 [00:00<00:00,
Downloading config.json: 100%|████| 570/570 [00:00<00:00, 4.90MB/s]
Downloading vocab.txt: 100%|█████| 232k/232k [00:00<00:00, 745kB/s]
Downloading tokenizer.json: 100%|█| 466k/466k [00:00<00:00, 1.03MB/
{'input_ids': [101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 100,
19081, 3075, 1012, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
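To make the three fields in the output easier to read, here is a toy sketch that reproduces the same dictionary without downloading anything. It is not how the library works internally: the real bert-base-uncased tokenizer applies WordPiece subword splitting against the full vocab.txt, whereas this sketch only lowercases, splits on words and symbols, and looks each piece up in a small hand-made table. The table `TOY_VOCAB` and the function `toy_encode` are hypothetical names, and the word-to-ID pairs are an assumption read off the output above by position (14 IDs for [CLS], 12 pieces, and [SEP]).

```python
import re

# Toy vocabulary. The IDs are matched to tokens by position in the output
# above (an assumption), not looked up in the real vocab.txt.
TOY_VOCAB = {
    "[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
    "we": 2057, "are": 2024, "very": 2200, "happy": 3407, "to": 2000,
    "show": 2265, "you": 2017, "the": 1996, "transformers": 19081,
    "library": 3075, ".": 1012,
}

def toy_encode(text):
    """Mimic the shape of the tokenizer output for a single sentence."""
    # "uncased" model: lowercase first, then split into word runs
    # and single non-space symbols (punctuation, emoji).
    pieces = re.findall(r"\w+|[^\w\s]", text.lower())
    # Wrap the sentence in the special start and end tokens.
    tokens = ["[CLS]"] + pieces + ["[SEP]"]
    # Anything missing from the table (here the emoji) maps to [UNK].
    input_ids = [TOY_VOCAB.get(t, TOY_VOCAB["[UNK]"]) for t in tokens]
    return {
        "input_ids": input_ids,
        # All zeros: every token belongs to the first (and only) sentence.
        "token_type_ids": [0] * len(input_ids),
        # All ones: no padding, so every position is attended to.
        "attention_mask": [1] * len(input_ids),
    }

encoded = toy_encode("We are very happy to show you the 🤗 Transformers library.")
print(encoded)
```

Under these assumptions, 101 and 102 are the [CLS] and [SEP] markers added around the sentence, and the 100 in the tenth position is [UNK], standing in for the emoji that is not in the vocabulary.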
Reference: Tokenizing a string