Shinichi Nakagawa
July 22, 2020
Rough baseball player performance prediction using ANN and Naive Bayes / Baseball player performance prediction with Python
A digest of the talk I plan to give at PyCon JP 2020.
A practice game at kawasaki.rb #86.
#Python #DataScience #MLB #Baseball
Transcript
Building a reasonably good prediction model for baseball player performance, Ver 1.0
Shinichi Nakagawa (@shinyorke)
kawasaki.rb #86, the just-over-7th-anniversary LT meetup
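Who am I?
• Shinichi Nakagawa (@shinyorke)
• Senior engineer at JX Press Corporation (JX通信社)
• Lately working on data platforms and data analysis all the time
• At heart, someone whose strength is baseball data science
• Recently started working as a technical advisor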
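This is really a kwsk.py topic, sorry :bow:
I will be speaking at PyCon JP for the 6th time, my first in 2 years. I came here to use this meetup as a sounding board for the first time in a while, so please bear with me.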
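[Figure] What I did this time
Full-scale development started in April, and the last task finished only recently. Counting planning and design, the project actually ran considerably longer.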
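Dataset creation and feature extraction
• Major League data from "Sean Lahman" and "retrosheet", all imported into BigQuery
• Tables created from the CSV files
• The features needed for the machine learning tasks computed in a straightforward, brute-force way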
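The deck keeps its feature list secret, and the original computation ran in BigQuery. As a hedged illustration only, the sketch below computes OPS (the metric the later slides rely on) with pandas, assuming columns named as in the Sean Lahman Batting table (AB, H, 2B, 3B, HR, BB, HBP, SF). The function name `add_ops` and the sample rows are made up for this example.

```python
import pandas as pd


def add_ops(batting: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of a Lahman-style batting table with OBP, SLG and OPS."""
    df = batting.copy()
    singles = df["H"] - df["2B"] - df["3B"] - df["HR"]
    total_bases = singles + 2 * df["2B"] + 3 * df["3B"] + 4 * df["HR"]
    obp_denominator = df["AB"] + df["BB"] + df["HBP"] + df["SF"]
    # where() turns zero denominators into NaN so empty seasons yield NaN.
    df["OBP"] = (df["H"] + df["BB"] + df["HBP"]) / obp_denominator.where(obp_denominator > 0)
    df["SLG"] = total_bases / df["AB"].where(df["AB"] > 0)
    df["OPS"] = df["OBP"] + df["SLG"]
    return df


# Made-up rows: one regular season line and one season without plate appearances.
sample = pd.DataFrame(
    {
        "playerID": ["example01", "example02"],
        "AB": [500, 0],
        "H": [150, 0],
        "2B": [30, 0],
        "3B": [5, 0],
        "HR": [20, 0],
        "BB": [50, 0],
        "HBP": [5, 0],
        "SF": [5, 0],
    }
)
with_ops = add_ops(sample)
print(with_ops[["playerID", "OBP", "SLG", "OPS"]])
```

OPS is simply on-base percentage plus slugging percentage, so the same arithmetic carries over to a SQL query if the data stays in BigQuery.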
Machine learning task #1: "Build clusters of similar players"
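Clustering by fielding position and the overall feel of a player's stats
• Put roughly: build a "players who resemble X" ranking.
• For example, asked "who resembles Hayato Sakamoto (Yomiuri Giants)?", the answer looks like "plays shortstop" and "reliably hits .300 with 20 home runs". Being a shortstop is a must; the rest depends on batting stats.
• Using batting stats and some fielding metrics, compute Euclidean distances across all pairs of players, which should yield a "resembles X" ranking for each player.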
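The brute-force idea on this slide can be sketched with NumPy as follows. This is a minimal illustration, not the author's code: the function name `similar_players`, the two features (batting average and home runs), the z-score scaling, and the four players are all assumptions made up for the example.

```python
import numpy as np


def similar_players(names, positions, features, target, top_n=3):
    """Rank players at the target's position by Euclidean distance."""
    features = np.asarray(features, dtype=float)
    # Standardize each column so no single stat dominates the distance
    # (an assumption; the deck does not say how features were scaled).
    std = features.std(axis=0)
    std[std == 0] = 1.0
    scaled = (features - features.mean(axis=0)) / std

    t = names.index(target)
    dists = np.linalg.norm(scaled - scaled[t], axis=1)
    ranking = [
        (names[i], float(dists[i]))
        for i in np.argsort(dists)
        # Same position is a hard requirement; the target itself is skipped.
        if i != t and positions[i] == positions[t]
    ]
    return ranking[:top_n]


# Synthetic rows: [batting average, home runs]
names = ["ss_a", "ss_b", "ss_c", "3b_a"]
positions = ["SS", "SS", "SS", "3B"]
features = [[0.300, 20], [0.305, 22], [0.240, 5], [0.301, 21]]

ranking = similar_players(names, positions, features, "ss_a")
print(ranking)
```

Comparing every player against every other player costs time quadratic in the number of players, which is one reason to consider an approximate index once the data grows.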
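Tried approximate nearest neighbor search (ANN)
• There were various options such as kNN and k-means, but ANN gave good-looking results right away, so I went with it.
• The ANN task was developed very quickly with a library called Annoy.
• Running it on data for 19,000 major leaguers produced good-looking results.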
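Code (excerpt) *The features are a secret
Everything from training to saving the model is just this much. The data is not large, so it finished in seconds.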
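Players close to Matt Chapman (young MLB third baseman)
Judged both from objective data and from what I know as a fan, the nearby players were gathered properly. They are all third basemen who produce at the plate, so no complaints!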
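Succeeded in gathering similar players (other positions looked good too)
Next, it would be great to classify them further into categories from here and "build out future stats"!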
Machine learning task #2: "Find players in the same category"
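Categorization with Naive Bayes
• Tried solving it the way one would solve a classification task in natural language processing.
• The candidates were roughly "Naive Bayes" and "Random Forest". This time I used Naive Bayes.
• Split "OPS", baseball's all-round ability metric, into categories, and ran the classifier on vectors made from several batting metrics.
• The implementation is plain scikit-learn and Pandas.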
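A minimal sketch of this kind of categorization with scikit-learn and pandas follows. The deck does not say which Naive Bayes variant, which input metrics, or which bin edges were used, so `GaussianNB`, the two features (OBP and SLG), the quantile bins, and the synthetic data are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the batting lines of the training players.
rng = np.random.default_rng(0)
train = pd.DataFrame(
    {
        "obp": rng.uniform(0.280, 0.420, 200),
        "slg": rng.uniform(0.320, 0.600, 200),
    }
)
train["ops"] = train["obp"] + train["slg"]

# Label: OPS cut into five equally sized bins, 1 (lowest) to 5 (highest).
# Quantile bins are an assumption; the deck does not give the bin edges.
train["category"] = pd.qcut(train["ops"], q=5, labels=[1, 2, 3, 4, 5]).astype(int)

model = GaussianNB()
model.fit(train[["obp", "slg"]], train["category"])

# Two made-up players to classify: a weak hitter and a strong hitter.
targets = pd.DataFrame({"obp": [0.290, 0.410], "slg": [0.330, 0.590]})
predicted = model.predict(targets)
print(predicted)
```

Because the label here is derived directly from the inputs, the example only demonstrates the mechanics; a realistic setup would predict a future category from earlier seasons.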
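What I did (summary) *Written roughly
• Training data
  • Pick up the stats of the 50 players most similar to the player to be predicted
  • The features are a secret, but they are ordinary batting stats with a pinch of secret seasoning
• Prediction data
  • The features of the player to be predicted
• Label data for the result
  • OPS turned into a 5-level category (1 to 5)
  • Plausible stats are produced from the per-age averages of the players who belong to the category chosen above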
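The last step, producing plausible stats from the per-age averages of players in the predicted category, could look like the pandas sketch below. The column names, the helper `project_by_age`, the choice of one category per player, and every number in the table are made up for illustration.

```python
import pandas as pd


def project_by_age(comps: pd.DataFrame, category: int, ages: list) -> pd.DataFrame:
    """Average the stats of players in one category, per age."""
    in_category = comps[comps["category"] == category]
    curve = in_category.groupby("age")[["ops", "hr"]].mean()
    # reindex keeps the requested ages in order; ages with no data become NaN.
    return curve.reindex(ages)


# Made-up comparable players with one row per player and age.
comps = pd.DataFrame(
    {
        "player": ["a", "a", "a", "b", "b", "b", "c", "c"],
        "age": [27, 28, 29, 27, 28, 29, 27, 28],
        "category": [4, 4, 4, 4, 4, 4, 2, 2],
        "ops": [0.80, 0.84, 0.78, 0.82, 0.86, 0.80, 0.65, 0.66],
        "hr": [24, 30, 22, 26, 28, 24, 8, 9],
    }
)

projection = project_by_age(comps, category=4, ages=[27, 28, 29])
print(projection)
```

A plain mean is sensitive to one unusual season among the comparable players, so a median or a trimmed mean may be worth trying as well.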
Let's look at it together with the predictions.
Matt Chapman (actual stats)
Stats for ages 24 to 26 (through last year). What I want to predict is the stats for ages 27 to 29.
Matt Chapman (with predictions)
A graph that includes the predicted stats for age 27 onward.
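Looking at the results…
• The results look comparatively realistic.
• Points such as "stats peak at age 28" and "they decline from age 29" feel real. *An athlete's physical peak is said to be ages 26 to 28.
• That said, the rise in home runs at age 28 looks somewhat suspicious. It is probably being pulled up by some individual player's stats.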
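Unfinished work and points to improve
• For the final classification, I want to try methods other than Bayes.
• There is a theory that "declining after age 28" differs by position, so a task that estimates the peak might be worth adding.
• The 2020 MLB season is being played with a reduced schedule because of you-know-what, so I want the predictions to match it (just divide by 2 and call it done? lol)
• All of this should be done by PyCon JP 2020 (in Ver. 25 "Tsurage")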
To be continued at PyCon JP 2020!
#TheEnd #ThankYouForListening
Shinichi Nakagawa (Twitter/Facebook/etc… @shinyorke)