model_pipeline_final.pdf
Maxwell
September 18, 2018
Model pipelines and related material from the Home Credit Default Risk competition.
Thanks to my teammates.
Transcript
Slide 1: ikiri_DS Model Pipeline (Home Credit Default Risk)

[Pipeline diagram. The connections between nodes are not recoverable from the transcript; the labels and scores that survive are listed below.]

Members: ONODERA, Maxwell, Nejumi, Tereka, RK
Feature sets 1 to 7 (order as transcribed): 600+1 features (LB804), 1000+1 features (LB803), meta app, meta bur, Kernel GP, Nejumi features, Tereka features, tosh
Feature set 8: Branden features
Feature set 9: takuoko features
Feature set 10: Angus features
Feature set 11: Tam Tam features
Feature set 12: RK features
Feature set 13: Ireko DAE
Model types: LGBM, CatBoost, CNN, RNN, NN, ExtTree, RGF, residual models (Res1, Res2, Res3). The numbers after each model in the diagram refer to the feature sets above.
Notes: Residual 1 is corrected with residual regression. Hidden-layer outputs are used as additional features to correct residuals. One LGBM model (marked *) is drawn on the next page.

Blending CV: 0.8094
Adversarial Stochastic Blending CV: 0.8096
Adversarial Stochastic Blending CV: 0.81050
Adversarial Stochastic Blending CV: 0.8061 (29.Aug.2018)

Leaderboard scores shown in the diagram:
Public 0.8085 (17th), Private 0.8017 (18th)
Nejumi prediction: Public 0.8093 (10th), Private 0.8016 (18th)
Public 0.8080 (23rd), Private 0.8028 (14th)
Public 0.8110 (3rd), Private 0.8042 (5th)
Giba Post Processing: Public 0.81241 (2nd), Private 0.80561 (2nd)
Slide 2: ikiri_DS Model Pipeline (Home Credit Default Risk), variant with stacking

[Pipeline diagram. The connections between nodes are not recoverable from the transcript; the labels and scores that survive are listed below.]

Members and feature sets: same labels as on the first pipeline slide (feature sets 1 to 13, from ONODERA, Maxwell, Nejumi, Tereka, RK, Branden, takuoko, Angus, Tam Tam, and Ireko DAE).
Model types: LGBM, XGB, CNN, RNN, NN, ExtTree, RGF, residual models (Res1, Res2, Res3, Res4).
Notes: Residual 1 is corrected with residual regression. Hidden-layer outputs are used as additional features to correct residuals. One LGBM model (marked *) is drawn on the next page.

Blending CV: 0.8085
Adversarial Stochastic Blending CV: 0.8085
Adversarial Stochastic Blending CV: 0.8097
Adversarial Stochastic Blending CV: 0.8061 (29.Aug.2018)
Stacking with LGBM: CV 0.8080, Public 0.8070 / Private 0.8015

Leaderboard scores shown in the diagram:
Public 0.8071 (26th), Private 0.8009 (37th)
Nejumi prediction: Public 0.8082 (23rd), Private 0.8022 (18th)
Public 0.8080 (23rd), Private 0.8028 (14th)
Public 0.8099 (7th), Private 0.8040 (6th)
Giba Post Processing (no score shown)
Slide 3: Maxwell, stacking-like LightGBM

[Pipeline diagram. The pairing of each AUC and feature count with a specific meta feature is not recoverable from the transcript.]

Input tables: application; bureau, bureau balance; previous, inst, POS_CASH, credit
XGBoost meta features: app meta feature, bureau meta feature, prev meta feature
Meta-feature AUCs (order as transcribed):
AUC 0.683 (SEED71), 0.683 (SEEDs avg)
AUC 0.772 (SEED71), 0.773 (SEEDs avg)
AUC 0.710 (SEED71), 0.712 (SEEDs avg)
Feature counts shown next to the meta-feature models: 229 features, 300 features, all data

Base features: ONODERA BASIC FEATURES (600 features), NEJUMI FEATURES (interest rate, 1 feature)

Stacking-like LightGBM settings:
5 stratified folds (shuffle = True)
5 / 8 SEEDs, rank averaged
SEED 71 for model fit
SEEDs 710, 711, 712, 713, 714 (715, 716, 717) for OOF prediction
Hyperparameters tuned for 603 features (reflected on meta features)

Results:
603 (604) features, selected based on ONODERA criteria: Local CV 0.80641, Public LB / Private LB 0.80569 / 0.79853 (100th / 105th)
952 features, without feature selection: Local CV 0.80646, LB 0.804 (~0.805)
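The slide mentions that predictions from several seeds were "rank averaged". The following is a minimal sketch of what rank averaging generally means, not the author's actual code: each seed's predictions are converted to ranks scaled into [0, 1], and the ranks are averaged per row. The function names `rank_normalize` and `rank_average` are hypothetical, the prediction values are made up for illustration, and only the seed numbers come from the slide.

```python
def rank_normalize(scores):
    """Convert scores to average ranks scaled into [0, 1]; ties share a rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of tied scores starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2.0  # zero-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    denom = max(n - 1, 1)  # avoid division by zero for a single row
    return [r / denom for r in ranks]


def rank_average(predictions_per_seed):
    """Average rank-normalized predictions across seeds, row by row."""
    ranked = [rank_normalize(p) for p in predictions_per_seed]
    n_seeds = len(ranked)
    return [sum(col) / n_seeds for col in zip(*ranked)]


# Illustrative (made-up) predictions for four rows from three seeds.
seed_preds = {
    710: [0.10, 0.40, 0.35, 0.80],
    711: [0.02, 0.03, 0.05, 0.04],
    712: [0.20, 0.90, 0.60, 0.95],
}
blended = rank_average(list(seed_preds.values()))
print(blended)
```

Because AUC depends only on the ordering of predictions, averaging ranks instead of raw probabilities keeps a seed with an unusual output scale (such as seed 711 above) from dominating or vanishing in the blend.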