
Running a Large Language Model (LLM) on the Sakura Cloud Koukaryoku (High-Firepower) Plan


Slides for a seminar talk at Open Source Conference 2023 Online/Hokkaido
https://event.ospn.jp/osc2023-online-do/session/958849


Hikaru Ashino

June 17, 2023


Transcript

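  1. Responsibilities
    • Backend and infrastructure development for services
    • Satellite data platform "Tellus"
    • Sakura VPS, Sakura Cloud

    Background
    • 2012 - 2016: Studied in a four-year program at an IT vocational school (Advanced Diploma)
    • First encountered the world of OSS at OSC2012 Sendai
    • 2013 - 2016: Worked part-time at a company providing MSP and hosting services
    • Built and operated virtualized environments using OpenStack and Linux KVM
    • Operated and troubleshot systems built on OSS
    • 2016 - present: Joined SAKURA internet as a new graduate
    • 2017 - 2019: Two years of research at an art graduate school (Master of Arts)
    Twitter: @tar_xzvff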
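  2. Company introduction: company overview
    Head office: Tokyo Tatemono Umeda Building 11F, 1-12-12 Umeda, Kita-ku, Osaka-shi, Osaka (relocated October 2021)
    Founded: December 23, 1996 (incorporated: August 17, 1999)
    Listed: October 12, 2005 (Mothers); moved to the TSE First Section (now the Prime Market) on November 27, 2015
    Capital: 2,256.92 million yen
    Employees: 710 consolidated (as of the end of March 2022)
    Group companies: アイティーエム株式会社, 株式会社S2i, 櫻花移動電信有限公司, ゲヒルン株式会社, ビットスター株式会社, プラナスソリューションズ株式会社, IzumoBASE株式会社, BBSakura Networks株式会社
    Main business: providing Internet infrastructure, with five data centers across three regions (Osaka, Tokyo, and Ishikari)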
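  3. Company introduction: services
    VPS / Cloud
    • A service that uses virtualization to build multiple virtual servers on one physical server and gives each customer exclusive use of a partition as a virtual dedicated server
    • An IaaS public cloud service offering high-performance servers and a highly scalable network at outstanding cost performance
    Dedicated servers
    • A service that gives exclusive use of an entire high-performance, scalable, and reliable server, freely customizable
    Data center: housing, remote housing
    • A service that reserves customer-dedicated housing space inside the data center, where network equipment, servers, and other hardware can be placed freely
    Rental servers
    • A service in which multiple subscribers share, or one subscriber exclusively uses, a single server, with management left to SAKURA internet
    New services
    • AI: a dedicated server service with GPUs, specialized for machine learning, data analysis, and high-precision simulation
    • IoT: Sakura Secure Mobile Connect, a mobile service for IoT that provides SIMs connecting directly to the cloud, securely, while still able to reach arbitrary networks
    • Space: Tellus is a satellite data platform from Japan. Beyond providing satellite data, it offers an environment for creating new businesses that use the data.

    As the only provider in Japan that offers public services across the board as its sole business, SAKURA internet can combine them to widen the available choices. That is its strength.
    ISMAP registration obtained.
    With Internet infrastructure as its business domain, the company operates data centers in Osaka, Tokyo, and Ishikari. The three sites are linked at 100Gbps, and a network with 1.84Tbps of total external connectivity supports Internet traffic in Japan (as of April 2023).
    Photos: Ishikari Data Center Building 3 (right); full view of the Ishikari Data Center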
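  4. Overview of the Sakura Cloud Koukaryoku plan
    A GPU server plan available on Sakura Cloud: a service that offers the high-performance NVIDIA V100 GPU with no initial fee, billed from one hour, with the same usability as the rest of the cloud.

    Specifications
    Zone: Ishikari Zone 1
    CPU: 4 vCPU
    Memory: 56GB
    GPU card: NVIDIA V100 (32GB) x 1
    GPU memory: 32GB
    Example uses: machine learning, deep learning, HPC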
  5. Major large language models (LLMs)
    • GPT (Generative Pre-trained Transformer)
    • Pre-trained model developed by OpenAI
    • https://huggingface.co/gpt2
    • ChatGPT is built on this family (GPT-3, GPT-3.5, GPT-4)
    • BERT (Bidirectional Encoder Representations from Transformers)
    • Pre-trained model developed by Google
    • https://huggingface.co/bert-base-uncased
    • OPT (Open Pre-trained Transformer)
    • Pre-trained model developed by Meta
    • https://huggingface.co/facebook/opt-13b
  6. Open-source large language models (LLMs)
    • Dolly https://huggingface.co/databricks/dolly-v2-12b
    • Released by Databricks; 12B, 7B, and 3B parameters
    • Commercial use permitted
    • OpenLLaMA https://huggingface.co/openlm-research/open_llama_7b
    • Released by OpenLM Research; 13B (trained on 600 billion tokens), 7B, and 3B parameters; an open-source reproduction of Meta AI's LLaMA
    • Commercial use permitted
    • Falcon https://huggingface.co/tiiuae/falcon-40b
    • Released by the Technology Innovation Institute; 40B and 7B parameters
    • Commercial use permitted
    • StableLM https://huggingface.co/stabilityai/stablelm-tuned-alpha-7b
    • Released by Stability AI; 7B and 3B parameters
    • Commercial use permitted
    • OpenCALM https://huggingface.co/cyberagent/open-calm-7b
    • Japanese large language model released by CyberAgent; up to 6.8B parameters
    • Commercial use permitted
    • Japanese GPT-2/BERT pre-trained models https://huggingface.co/rinna
    • Released by rinna
    • MPT-7B https://huggingface.co/mosaicml/mpt-7b
    • Released by MosaicML; 7B parameters; good at long texts
    • Commercial use permitted
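As a rough check on which of the models listed above could fit on the plan's single V100 (32GB), the sketch below estimates the memory needed for the weights alone in float16 (2 bytes per parameter). The parameter counts are the nominal figures from the slide; the estimate ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function names are illustrative, not part of any library.

```python
# Back-of-the-envelope estimate: float16 weights take 2 bytes per parameter.
# Assumption: weights only; activations and KV cache are NOT included.
BYTES_PER_PARAM_FP16 = 2
GPU_MEMORY_GB = 32  # NVIDIA V100 (32GB), from the plan overview

# Nominal parameter counts taken from the model list above.
MODELS = {
    "open-calm-7b": 6.8e9,
    "dolly-v2-12b": 12e9,
    "OpenLLaMA 13B": 13e9,
    "falcon-40b": 40e9,
}


def fp16_weight_gb(params: float) -> float:
    """Return the size of the float16 weights in GB (10**9 bytes)."""
    return params * BYTES_PER_PARAM_FP16 / 1e9


def fits(params: float, gpu_gb: float = GPU_MEMORY_GB) -> bool:
    """True if the float16 weights alone are smaller than the GPU memory."""
    return fp16_weight_gb(params) < gpu_gb


for name, params in MODELS.items():
    verdict = "fits" if fits(params) else "does not fit"
    print(f"{name}: ~{fp16_weight_gb(params):.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

Under these assumptions the 7B-class models leave ample headroom on one V100, while a 40B model would need quantization or multiple GPUs.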
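  7. Steps to run a large language model on Sakura Cloud
    • This time we run OpenCALM, released by CyberAgent
    • OpenCALM
    • One of the few Japanese large language models
    • Published on Hugging Face*

    * A platform for publishing and sharing machine learning models
    Press release: "CyberAgent develops its own Japanese LLM (large language model): achieving natural Japanese text generation"
    https://www.cyberagent.co.jp/news/detail/id=28797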
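  8. Steps to run a large language model on Sakura Cloud (3)
    • Create the GPU server
    • Select each item on the "Add server" screen

    Item: Value
    Server plan: GPU plan
    Disk - Disk plan: SSD plan
    Disk - Disk source: Archive
    Archive selection: Ubuntu 22.04.1 LTS
    Disk - Disk size: 100GB
    Disk modification - Modify disk: checked
    Admin user password: enter any password
    Host name, public key: enter or select as needed
    Server info - Name: any easy-to-recognize name
    Number of servers: 1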
  9. Steps to run a large language model on Sakura Cloud (6)
    • Install the GPU driver
    • Install it by following NVIDIA's documentation
    • https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

    $ sudo apt-get install linux-headers-$(uname -r)
    $ distribution=$(. /etc/os-release;echo $ID$VERSION_ID | sed -e 's/\.//g')
    $ wget https://developer.download.nvidia.com/compute/cuda/repos/$distribution/x86_64/cuda-keyring_1.0-1_all.deb
    $ sudo dpkg -i cuda-keyring_1.0-1_all.deb
    $ sudo apt-get update
    $ sudo apt-get -y install cuda-drivers
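Once the driver is installed (possibly after a reboot), it can help to confirm that the GPU is visible before loading a model. The following is a minimal sketch, not part of the original talk: it calls `nvidia-smi` with its CSV query options and parses the result. The helper names `parse_gpu_rows` and `detect_gpus` are made up for this example.

```python
import shutil
import subprocess

# nvidia-smi query: one CSV row per GPU, without a header line.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_rows(csv_text: str) -> list:
    """Parse 'name, memory, driver' CSV rows into a list of dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip anything that is not a well-formed row
        name, memory_total, driver_version = fields
        gpus.append(
            {"name": name, "memory_total": memory_total, "driver_version": driver_version}
        )
    return gpus


def detect_gpus() -> list:
    """Return the GPUs reported by nvidia-smi, or [] if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    found = detect_gpus()
    print(found if found else "No NVIDIA GPU detected; check the driver installation.")
```

An empty list suggests the driver is not loaded yet; on the plan described above, one V100 entry would be expected.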
  10. Steps to run a large language model on Sakura Cloud (8)
    • Run the sample code
    • Run the OpenCALM sample code published on Hugging Face
    • https://huggingface.co/cyberagent/open-calm-7b

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the model in float16 and let device_map place it on the GPU
    model = AutoModelForCausalLM.from_pretrained("cyberagent/open-calm-7b", device_map="auto", torch_dtype=torch.float16)
    tokenizer = AutoTokenizer.from_pretrained("cyberagent/open-calm-7b")

    # Japanese prompt: "With AI, our lives will..."
    inputs = tokenizer("AIによって私達の暮らしは、", return_tensors="pt").to(model.device)
    with torch.no_grad():
        tokens = model.generate(
            **inputs,
            max_new_tokens=64,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.05,
            pad_token_id=tokenizer.pad_token_id,
        )

    output = tokenizer.decode(tokens[0], skip_special_tokens=True)
    print(output)
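The sample code sets `do_sample=True`, `temperature=0.7`, and `top_p=0.9`. To make those two knobs concrete, here is a simplified, standard-library illustration of temperature scaling followed by nucleus (top-p) filtering over a toy list of logits. It is a sketch of the general technique, not the internal implementation used by `transformers`, and the function names are invented for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative probability
    reaches top_p, then renormalize. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy logits for a four-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.0, -1.0]
probs = softmax_with_temperature(toy_logits, temperature=0.7)
candidates = top_p_filter(probs, top_p=0.9)
print(candidates)  # the next token would be sampled from these candidates
```

A temperature below 1 sharpens the distribution toward the most likely tokens, and top-p then discards the low-probability tail before sampling.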
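  11. Benefits of self-hosting a large language model
    • The need to use an LLM customized for a specific task
    • Fine-tune on a task-specific dataset
    • Data security and privacy
    • Use LLMs with data that cannot leave the organization
    • Important from the standpoint of data leakage and similar risks
    • Benefits of on-premises
    • Cost control: if you already have infrastructure that can run an LLM, you can run models without worrying about pay-per-use billing
    • Optimization: because the model runs inside your own organization, slow responses can potentially be improved
    • However, the electricity bill is likely to be enormous 💸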
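  12. Impressions
    • The steps were simple and I got the model running quickly
    • Only the GPU driver and Python libraries needed installing
    • It seems flexible and easy to embed for general-purpose use
    • For example, exposing it as an HTTP API to connect with other services, or turning it into a Slack bot
    • I would like a bit more accuracy
    • Responses are sometimes somewhat disappointing
    • I want to try fine-tuning
    • I want to use it at work
    • I am glad to see that it also runs on our own cloud
    • I would like to test it on other GPUs in the future
    • I want to use LLMs, dig deeper into them, and build one
    • First, I intend to study machine learning from the basics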
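The HTTP API idea mentioned above could look roughly like the following sketch, which uses only the Python standard library. Everything here is an assumption made for illustration: the `/generate` path, the JSON shape, and the `generate_text` placeholder, which stands in for the tokenizer and `model.generate` call from the sample code.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_text(prompt: str) -> str:
    """Hypothetical placeholder: replace with the tokenizer/model.generate call."""
    return prompt + "..."


def handle_payload(body: bytes, generate=generate_text):
    """Validate a JSON request body and return (status_code, response_dict)."""
    try:
        data = json.loads(body or b"{}")
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body must be JSON"}
    prompt = data.get("prompt") if isinstance(data, dict) else None
    if not isinstance(prompt, str) or not prompt:
        return 400, {"error": "missing 'prompt'"}
    return 200, {"output": generate(prompt)}


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/generate":  # assumed endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        encoded = json.dumps(payload, ensure_ascii=False).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(encoded)))
        self.end_headers()
        self.wfile.write(encoded)


def serve(port: int = 8000) -> None:
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), GenerateHandler).serve_forever()
```

Calling `serve()` starts the server, and a client would then POST `{"prompt": "..."}` to `/generate`. Keeping validation in `handle_payload` separates the request handling from the model call, so the same function could back a Slack bot as well.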
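  13. Announcement
    • 6/24 (Sat) Open Source Conference 2023 Hokkaido (exhibition)
    We will introduce the Sakura Cloud Koukaryoku plan, which lets you use GPU servers by the hour.
    We will also hand out goods and coupons.
    The demo from this talk will be on display too!
    https://ospn.connpass.com/event/285754/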
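  14. SAKURA internet is stepping up engineer recruitment
    For customers who take on challenges with strong enthusiasm and passion for creating new ideas, and for everyone connected with us, SAKURA internet envisions what the future should be and provides, through the Internet, every approach to turning "what you want to do" into "what you can do".
    SAKURA internet: Won't you build the public cloud that supports society together with us?
    Software development, from infrastructure platforms to the front end. Now hiring!
    See our website for details; casual interviews are also available 👉 www.sakura.ad.jp/lp/22engineer/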