Competition predicting the electricity and water usage of buildings in various countries.
train / test → by time
[Figure: train, test (Public) and test (Private) along the time axis (2016, 2017, 2018), for every site_id, building_id, meter]

Most solutions split the validation data by time. LeakValidation, which uses the leaked data as the validation set, was also popular.
The top teams built a separate model per id, presumably because the data differ greatly from one id to another.

Solution
・1st  Model: Site, Meter, Building_id+Meter  cv: time-series CrossValidation (a simple split by time)
・2nd  Model: Site+Meter  cv: LeakValidation
・5th  Model: Building_id+Meter  cv (fit): TimeSplit (1-5 / 9-12)  cv (predict): use all train

Incidentally, we used StratifiedKFold and finished 535/3614 (124→535).
Lesson: do not use StratifiedKFold for regression!
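A time-based fit / validation split in the spirit of TimeSplit (1-5 / 9-12) might look like the sketch below. It is an illustration only, not any team's code: it assumes that "1-5 / 9-12" denotes calendar months, and `time_split` and the toy `months` array are hypothetical names.

```python
import numpy as np


def time_split(months, fit_months, valid_months):
    """Return (fit_idx, valid_idx) row indices for a time-based holdout.

    `months` holds the calendar month of every training row.
    Reading "1-5 / 9-12" as months is an assumption, not stated on the slide.
    """
    months = np.asarray(months)
    fit_idx = np.flatnonzero(np.isin(months, list(fit_months)))
    valid_idx = np.flatnonzero(np.isin(months, list(valid_months)))
    return fit_idx, valid_idx


# Toy data (hypothetical): one row per month of a single training year.
months = np.arange(1, 13)

# Fit on months 1-5, validate on months 9-12; months 6-8 are left unused.
fit_idx, valid_idx = time_split(months, range(1, 6), range(9, 13))

print("fit months:  ", months[fit_idx].tolist())
print("valid months:", months[valid_idx].tolist())
```

Every fit row precedes every validation row, so the validation set plays the same role as a test set that lies in the future.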
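IEEE-CIS Fraud Detection
Competition predicting whether a transaction is fraudulent.
train and test were split by time, and no User ID was provided. In practice, however, each User's transactions were concentrated within one month.
What we assumed: train and test are split by time across all Users.
What actually held: train and test are effectively split by User.
[Figure: train and test (Public) along the time axis, by User]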
train / test → by user (≒ Month)
[Figure: train, test (Public) and test (Private) along the time axis, by User]

Solution
・…  cv (predict): GroupKFold by Month
・2nd  cv (fit): time holdout  cv (predict): 1. use all train  2. Time KFold
・5th  cv: GroupKFold by Month

Incidentally, we used "use all train" and finished 1237/6381 (796→1237).
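GroupKFold by Month can be sketched with scikit-learn's `GroupKFold`, using the month of each transaction as the group label. This is a minimal sketch on synthetic data, not any team's pipeline: the `month` array, the feature matrix `X` and the labels `y` are all hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Synthetic stand-in for the training set (hypothetical data):
# 6 months with 100 transactions each.
rng = np.random.default_rng(0)
month = np.repeat(np.arange(6), 100)   # group label: month index per row
X = rng.normal(size=(month.size, 4))   # dummy features
y = rng.integers(0, 2, size=month.size)  # dummy fraud labels

# One fold per month: every validation fold is a month the model never saw.
gkf = GroupKFold(n_splits=6)
folds = list(gkf.split(X, y, groups=month))

for fold, (fit_idx, valid_idx) in enumerate(folds):
    held_out = np.unique(month[valid_idx]).tolist()
    print(f"fold {fold}: validate on month(s) {held_out}, "
          f"fit on {len(fit_idx)} rows")
```

Because a User's transactions fall within roughly one month, holding out whole months keeps most Users out of the fit and validation sets at the same time, which approximates the by-User split between train and test.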