Reading group: Kaggleで勝つデータ分析の技術, Chapter 2
Yust0724
April 15, 2020
Summary for the reading group
Transcript
Chapter 2: Tasks and Evaluation Metrics
2020/04/15 Yu Sato @Yust
Reading group, session 4
Self-introduction
▪ Taking part in Kaggle since around February 2019.
▪ Likes: クロ, Kaggle, sauna, eel
▪ Curious about: あつ…
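Kaggleで勝つデータ分析の技術: Table of contents
Chapter 1: What is a data analysis competition?
Chapter 2: Tasks and evaluation metrics ← we are here
Chapter 3: Creating features
Chapter 4: Building models
Chapter 5: Evaluating models
Chapter 6: Tuning models
Chapter 7: Ensembling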
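What is model evaluation?
A model is evaluated in two places, and the measure used in each is called the "objective function" and the "evaluation metric", respectively.
Flow: train → fit → model → predict (test) → evaluate → Leaderboard
・fit: compute f(x), where f(x) is the objective function; update the parameters; loop until f(x) reaches its minimum.
・evaluate: compute f(x), where f(x) is the evaluation metric.
The objective function depends on the model; the evaluation metric depends on the competition. (p.87)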
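The two roles can be illustrated with a minimal sketch. It assumes a hypothetical one-parameter linear model and toy data, neither of which appears in the slides: mean squared error plays the objective function minimized inside the fit loop, and MAE stands in for a competition's evaluation metric.

```python
def mse(w, xs, ys):
    """Objective function: mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(w, xs, ys):
    """Derivative of the objective with respect to the parameter w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def fit(xs, ys, lr=0.01, tol=1e-12, max_iter=10_000):
    """Update w by gradient descent until the objective stops decreasing."""
    w = 0.0
    prev = mse(w, xs, ys)
    for _ in range(max_iter):
        w -= lr * mse_grad(w, xs, ys)
        cur = mse(w, xs, ys)
        if prev - cur < tol:
            break
        prev = cur
    return w


def mae(y_true, y_pred):
    """Evaluation metric: mean absolute error (assumed competition metric)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data, for illustration only.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.1, 3.9, 6.2, 7.8]
test_x = [5.0, 6.0]
test_y = [10.1, 11.8]

w = fit(train_x, train_y)                 # optimized with the objective
test_pred = [w * x for x in test_x]
score = mae(test_y, test_pred)            # judged with the metric
print(f"w = {w:.3f}, MAE on test = {score:.3f}")
```

The model never sees MAE during training; it is only used to judge the finished predictions, which is why the two functions can differ.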
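Concrete functions
The functions used differ by task type. (p.62~83)
Objective functions
・Regression: RMSE
・Binary classification: logloss
・Multi-class classification: multi-class logloss
Evaluation metrics
・Regression: RMSE, RMSLE, MAE, R2 (coefficient of determination)
・Binary classification: accuracy, error rate, F1-score, Fβ-score, logloss, AUC
・Multi-class classification: multi-class accuracy, mean-F1, macro-F1, micro-F1, quadratic weighted kappa (QWK), multi-class logloss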
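For reference, several of the metrics listed above can be written in a few lines each. This is a minimal standard-library sketch using the usual textbook definitions; the function names and sample values are illustrative assumptions, and in practice a library such as scikit-learn would normally be used instead.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error (inputs must be > -1)."""
    return math.sqrt(
        sum((math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred))
        / len(y_true)
    )


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_label):
    """Fraction of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_label)) / len(y_true)


def f1_score(y_true, y_label):
    """F1 for binary labels, with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_label))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_label))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_label))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def logloss(y_true, y_prob, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical values, for illustration only.
print(rmse([0.0, 0.0], [3.0, 4.0]))
print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
print(logloss([1, 0], [0.5, 0.5]))
```

Note that accuracy and F1 take predicted labels, while logloss takes predicted probabilities; that difference is what forces the thresholding step for label-based metrics.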
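Relationship between the objective function and the evaluation metric
Broadly, there are the following four patterns. (p.90)
1. The same function as the evaluation metric can be set as the objective function (used as is).
   Objective function: RMSE, logloss, multi-class logloss → Evaluation metric: RMSE, logloss, multi-class logloss
2. The evaluation metric can be converted mathematically into an objective function (used almost as is).
   Objective function: RMSE on the log-transformed target → inverse log transform → Evaluation metric: RMSLE
3. The evaluation metric scores 0/1 predictions in a classification task (explained in detail).
   Objective function: RMSE, logloss, multi-class logloss → classify by threshold, e.g. 0.23, 0.88, 0.67, 0.12 → 0, 1, 1, 0 (threshold 0.60) → Evaluation metric: accuracy, error rate, F1-score, Fβ-score
4. A function similar to the evaluation metric can be set as the objective function (explained in detail).
   Objective function: Fair, QWK approximated by a continuous function → Evaluation metric: MAE, QWK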
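Patterns 2 and 3 can be checked numerically with a short sketch. The regression values below are hypothetical; the four probabilities and the 0.60 threshold are the ones shown on the slide. Treating a probability equal to the threshold as class 1 is an assumption, since the slide does not say which side the boundary falls on.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rmsle(y_true, y_pred):
    """RMSLE written directly from its definition."""
    return math.sqrt(
        sum((math.log(1 + t) - math.log(1 + p)) ** 2 for t, p in zip(y_true, y_pred))
        / len(y_true)
    )


# Pattern 2: RMSLE on the raw scale equals RMSE on the log1p scale,
# so a model can be trained with RMSE on log1p(target).
y_true = [10.0, 200.0, 3000.0]   # hypothetical targets
y_pred = [12.0, 150.0, 3300.0]   # hypothetical predictions
log_true = [math.log1p(v) for v in y_true]
log_pred = [math.log1p(v) for v in y_pred]
print(rmsle(y_true, y_pred), rmse(log_true, log_pred))

# Inverse log transform: map log-scale predictions back before submitting.
restored = [math.expm1(v) for v in log_pred]


# Pattern 3: train with a probabilistic objective, then cut at a threshold.
def classify(probs, threshold):
    """Return 1 where the probability is at or above the threshold (assumed)."""
    return [1 if p >= threshold else 0 for p in probs]


labels = classify([0.23, 0.88, 0.67, 0.12], 0.60)
print(labels)
```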
How are the thresholds derived?
Find the optimized thresholds on train, then use them to classify test. (p.91, 100, 101)
threshold
0.00 ~ 0.80 → 0
0.81 ~ 1.80 → 1
1.81 ~ 2.50 → 2
2.51 ~ 3.00 → 3
train → LGBM → OptimizedRounder
prob_target  target
0.45         1
1.12         2
2.90         2
2.04         0
test → LGBM → OptimizedRounder
prob_target  target
0.45         0
1.12         1
2.90         3
2.04         2
OptimizedRounder carries out
・derivation of the optimal thresholds
・conversion from regression output to classes
Arai-san has put the code together (*).
(*) https://qiita.com/kaggle_master-arai-san/items/d59b2fb7142ec7e270a5
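The idea of fitting thresholds on train and reusing them on test can be sketched as follows. This is not the linked OptimizedRounder code: it swaps in a plain coordinate-wise grid search so that only the standard library is needed, and `fit_thresholds`, its parameters, and the training data are hypothetical. The final check uses the thresholds and test predictions shown on the slide.

```python
from bisect import bisect_left


def apply_thresholds(values, thresholds):
    """Map each continuous prediction to a class index using sorted cut points."""
    return [bisect_left(thresholds, v) for v in values]


def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """QWK computed from the confusion matrix and the expected matrix."""
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_true[i] * hist_pred[j] / n
    # Kappa is undefined when den is 0; this sketch returns 0.0 in that case.
    return 1.0 - num / den if den > 0 else 0.0


def fit_thresholds(values, y_true, n_classes, step=0.05, n_rounds=3):
    """Hypothetical search: move one cut point at a time and keep any gain in QWK."""
    thresholds = [k + 0.5 for k in range(n_classes - 1)]
    lo, hi = min(values), max(values)
    candidates = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    best = quadratic_weighted_kappa(y_true, apply_thresholds(values, thresholds), n_classes)
    for _ in range(n_rounds):
        for k in range(len(thresholds)):
            for c in candidates:
                trial = sorted(thresholds[:k] + [c] + thresholds[k + 1:])
                score = quadratic_weighted_kappa(
                    y_true, apply_thresholds(values, trial), n_classes
                )
                if score > best:
                    best, thresholds = score, trial
    return thresholds, best


# Hypothetical train predictions and labels, for illustration only.
train_pred = [0.2, 0.7, 0.9, 1.3, 1.6, 1.9, 2.2, 2.4, 2.7, 2.9]
train_true = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3]

initial_score = quadratic_weighted_kappa(
    train_true, apply_thresholds(train_pred, [0.5, 1.5, 2.5]), 4
)
fitted, fitted_score = fit_thresholds(train_pred, train_true, n_classes=4)
print(fitted, initial_score, fitted_score)

# Thresholds and test predictions from the slide.
slide_labels = apply_thresholds([0.45, 1.12, 2.90, 2.04], [0.80, 1.80, 2.50])
print(slide_labels)
```

A grid search is easy to follow but scales poorly; a derivative-free optimizer is a common alternative when there are many cut points.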
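What is a similar function?
Understand the characteristics of each function and choose a function that approximates it.
・Approximating QWK (p.101~103)
・Approximating MAE (p.103, Fig. 2.23)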
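As one illustration of an MAE-like objective, here is a minimal sketch of the Fair loss in its commonly used form, c^2 * (|x|/c - log(1 + |x|/c)), together with its first and second derivatives. The constant `c`, the function names, and the gradient/hessian return convention are assumptions for illustration, not taken from the slides.

```python
import math


def fair_loss(residual, c=1.0):
    """Fair loss: smooth near 0, roughly linear (like MAE) for large residuals."""
    a = abs(residual) / c
    return c * c * (a - math.log1p(a))


def fair_grad(residual, c=1.0):
    """First derivative; bounded by c in absolute value, unlike squared error."""
    return c * residual / (abs(residual) + c)


def fair_hess(residual, c=1.0):
    """Second derivative; strictly positive, unlike MAE where it is 0."""
    return c * c / (abs(residual) + c) ** 2


def fair_objective(y_true, y_pred, c=1.0):
    """Per-sample gradient and hessian, the shape GBDT custom objectives expect."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    grad = [fair_grad(r, c) for r in residuals]
    hess = [fair_hess(r, c) for r in residuals]
    return grad, hess


# Hypothetical values, for illustration only.
grad, hess = fair_objective([1.0, 2.0, 3.0], [1.5, 2.0, 13.0])
print(grad, hess)
```

MAE has a gradient of constant magnitude and a zero second derivative, which is awkward for methods that rely on second-order information; the Fair loss keeps a similar shape for large residuals while remaining twice differentiable.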
Thank you very much.
▪ Kaggle: @Yust
▪ Twitter: @yust_kaggle
▪ e-mail:
[email protected]