Parameter Tuning with Bayesian Optimization


Waku Michishita

April 13, 2019

Transcript

  1. Parameter Tuning with Bayesian Optimization

    Waku Michishita, April 13, 2019
  2. Agenda

    1. Hyperparameters 2. Existing tuning methods 3. Bayesian Optimization 4. Tree-structured Parzen Estimator (TPE) 5. hyperopt / Optuna
  3. What are hyperparameters?

    • Parameters of models such as XGBoost and Gradient Boosting that are not learned from the data and must be set by the user • Examples for gradient boosting: n_estimator, max_depth, min_sample_split, max_leaf_nodes, etc. • Model performance depends strongly on how these hyperparameters are set, so they need to be tuned
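As a concrete illustration, hyperparameters like the ones named above are usually gathered into a search space that a tuner iterates over. The parameter names follow the slide; the ranges below are hypothetical, not recommendations:

```python
# Hypothetical search space over the gradient-boosting hyperparameters
# listed above (illustrative ranges only).
search_space = {
    "n_estimator":      range(50, 501, 50),   # number of boosting rounds
    "max_depth":        range(2, 11),         # depth of each tree
    "min_sample_split": range(2, 21),         # min samples to split a node
    "max_leaf_nodes":   range(8, 129, 8),     # cap on leaves per tree
}

# The tuning problem: pick one value per hyperparameter so the trained
# model's evaluation metric is as good as possible.
n_combinations = 1
for values in search_space.values():
    n_combinations *= len(values)
print(n_combinations)  # 27360 combinations from just four parameters
```

Even four parameters span tens of thousands of combinations, which is why the search strategy matters.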
  4. Classical tuning approaches

    • Common search strategies: • Grid Search • Random Search • (Manual Search) • Each evaluates many candidate settings and keeps the best one
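The two automated strategies above can be sketched in a few lines. The `metric` function here is a hypothetical stand-in for a real validation score, and the parameter ranges are illustrative:

```python
# Minimal sketch contrasting Grid Search and Random Search on a toy
# objective; `metric` is a hypothetical stand-in for a validation loss.
import random

def metric(max_depth, learning_rate):
    # pretend validation loss, best around max_depth = 6, learning_rate = 0.1
    return (max_depth - 6) ** 2 + (learning_rate - 0.1) ** 2 * 100

# Grid Search: every combination of a fixed set of values (12 trials)
grid = [(d, lr) for d in [2, 4, 6, 8] for lr in [0.01, 0.1, 0.3]]
best_grid = min(grid, key=lambda p: metric(*p))

# Random Search: the same budget of 12 trials, sampled from the ranges
random.seed(0)
trials = [(random.randint(2, 8), random.uniform(0.01, 0.3)) for _ in range(12)]
best_random = min(trials, key=lambda p: metric(*p))

print(best_grid)  # (6, 0.1): this grid happens to contain the optimum
```

Grid search only finds values that are on the grid, while random search covers the ranges more densely per parameter; neither uses the results of past trials, which is the gap Bayesian optimization fills.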
  5. Hyperparameter optimization libraries

    GPyOpt hyperopt optuna TPOT deap auto_ml BTB Chocolate Cornell-MOE devol H2O HORD HPOlib HpBandSter hypergrad mlrMBO pbt pycma rbfopt ROBO spearmint test-tube TUNE (list: https://medium.com/@mikkokotila/a-comprehensive-list-of-hyperparameter-optimization-tuning-solutions-88e067f19d9)
  6. Topics covered today

    • Methods: Bayesian Optimization (Sequential Model-Based Optimization / SMBO), Tree-structured Parzen Estimator (TPE) • Libraries: hyperopt, Optuna
  7. Bayesian Optimization

    The method behind hyperparameter tuners such as hyperopt and Optuna. Bayesian Optimization in outline: • the evaluation metric is treated as a black-box function of the hyperparameters • a probabilistic surrogate model of that function is fitted to the points evaluated so far • an acquisition function computed from the surrogate decides which point to evaluate next • a Gaussian Process is typically used as the surrogate model
  8. Problem setting

    Hyperparameter tuning as an optimization problem: find x* = argmin_{x ∈ X} f(x) • x ... the hyperparameters • f(x) ... the evaluation metric to be minimized (ex. RMSE)
  9. Evaluating f(x)

    Training the model and measuring the metric is itself one evaluation of f(x), so every evaluation of f(x) is expensive.
  10. Step 1

    (figure) Evaluate f(x) at a few initial points.
  11. Step 2

    (figure) Fit a surrogate model to the observed values of f(x); the model predicts f(x) with uncertainty.
  12. Step 3

    (figure) Compute the acquisition function from the surrogate model of f(x).
  13. Step 4

    (figure) Evaluate f(x) at the point that maximizes the acquisition function and add the new observation.
  14. Step 5

    (figure) Refit the surrogate model of f(x) with the enlarged set of observations.
  15. Repeat

    (figure) As steps 2 to 5 repeat, the surrogate approaches f(x) and the search concentrates around the minimum of f(x).
  16. Why this works

    • f(x) itself is never optimized directly ← f(x) is a black box • The cheap surrogate and acquisition function decide where to evaluate next ← balancing exploration of uncertain regions against exploitation of promising ones • The aim is to keep the number of evaluations of f(x) small ← each evaluation of f(x) is expensive • → a good x can be found with few evaluations
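The loop in steps 1 to 5 can be sketched end to end. This is a minimal sketch, not the deck's demo: the 1-D objective f(x) = (x - 2)^2 is a toy stand-in for a validation metric, the surrogate is a zero-mean Gaussian Process with an RBF kernel, and the acquisition is Expected Improvement, all written from scratch with the standard library:

```python
import math

# Toy stand-in for the expensive metric f(x); true minimum at x = 2.
def f(x):
    return (x - 2.0) ** 2

# RBF kernel for the Gaussian-Process surrogate.
def kernel(a, b):
    return math.exp(-0.5 * (a - b) ** 2)

# Gauss-Jordan elimination: solves A v = b for small systems.
def solve(A, b):
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                t = M[r][c] / M[c][c]
                M[r] = [mr - t * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# GP posterior mean/variance at x given observations (X, y); zero prior mean.
def gp_posterior(x, X, y, noise=1e-6):
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    kv = [kernel(x, a) for a in X]
    alpha = solve(K, y)                     # [K + noise*I]^{-1} y
    mu = sum(ki * ai for ki, ai in zip(kv, alpha))
    v = solve(K, kv)                        # [K + noise*I]^{-1} k
    var = kernel(x, x) - sum(ki * vi for ki, vi in zip(kv, v))
    return mu, max(var, 1e-12)

# Expected Improvement for minimization: (best - mu) Phi(z) + sigma phi(z).
def expected_improvement(mu, var, best):
    sigma = math.sqrt(var)
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * Phi + sigma * phi

# Step 1: a couple of initial evaluations; then repeat steps 2-5.
X, y = [0.0, 5.0], [f(0.0), f(5.0)]
grid = [i * 0.1 for i in range(51)]         # candidate points in [0, 5]
for _ in range(10):
    best = min(y)
    x_next = max((x for x in grid if x not in X),
                 key=lambda x: expected_improvement(*gp_posterior(x, X, y), best))
    X.append(x_next)                        # evaluate f at the EI maximizer...
    y.append(f(x_next))                     # ...and add the new observation
x_best = X[y.index(min(y))]
print(round(x_best, 1))                     # close to the true minimum x = 2
```

With only twelve evaluations of f the search settles near x = 2, which is the point of the method: few expensive evaluations, guided by the surrogate.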
  17. Acquisition function

    • The next point to evaluate is chosen by maximizing an acquisition function computed from the surrogate model of f(x) • It trades off exploring regions where the model is uncertain against exploiting regions where f(x) is predicted to be small • Typical choices: PI (Probability of Improvement), EI (Expected Improvement) • This talk uses EI: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* - y, 0) p(y|x) dy, where y* is the best value of f(x) observed so far
  18. EI

    (Expected Improvement) • Definition: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* - y, 0) p(y|x) dy • p(y|x) is the model's predictive distribution of the metric at x, and y* is the best value observed so far • "Improvement" is how much better than y* the new point turns out to be; EI is its expectation • Computing it requires the predictive distribution p(y|x), which is what the surrogate model of y = f(x) provides
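The EI integral can be checked numerically. This sketch assumes p(y|x) is Gaussian with illustrative values (mu = 1.0, sigma = 0.5, best-so-far y* = 1.2) and compares a Monte Carlo estimate of the integral against the standard closed-form expression for the Gaussian case:

```python
# Numeric sanity check of the EI definition, assuming a Gaussian p(y|x)
# with illustrative parameters.
import math
import random

mu, sigma, y_star = 1.0, 0.5, 1.2

# Monte Carlo estimate of EI = E[max(y* - y, 0)] under p(y|x)
random.seed(0)
samples = [random.gauss(mu, sigma) for _ in range(200_000)]
ei_mc = sum(max(y_star - y, 0.0) for y in samples) / len(samples)

# Closed form for a Gaussian p(y|x):
# EI = (y* - mu) * Phi(z) + sigma * phi(z),  z = (y* - mu) / sigma
z = (y_star - mu) / sigma
Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
ei_closed = (y_star - mu) * Phi + sigma * phi

print(round(ei_mc, 3), round(ei_closed, 3))  # the two values agree closely
```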
  19. Gaussian Process

    Modeling p(y|x) with a Gaussian Process: given the observations D_t = {(x_i, y_i)}_{i=1..t}, the prediction at a new point x_{t+1} is Gaussian: y_{t+1} | D_t ~ N(μ_t(x_{t+1}), σ_t²(x_{t+1}) + σ_noise²) • μ_t(x) ... predictive mean at x • σ_t²(x) ... predictive variance at x • σ_noise² ... observation noise
  20. Predictive mean and variance

    The GP posterior gives μ_t(x) and σ_t²(x) in closed form from the observations: • μ_t(x) = kᵀ [K + σ_noise² I]⁻¹ y_{1:t} • σ_t²(x) = k(x, x) − kᵀ [K + σ_noise² I]⁻¹ k where K is the kernel matrix over the observed points and k is the vector of kernel values between x and each observed point. Example kernel: • k(x_i, x_j) = exp(−‖x_i − x_j‖²) → points close to each other are strongly correlated
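The two formulas above can be exercised directly. A minimal sketch assuming two observed points, an RBF kernel k(a, b) = exp(-0.5 (a - b)^2), and a tiny noise term; the 2x2 matrix [K + σ_noise² I] is inverted explicitly (the data values are illustrative):

```python
import math

def k(a, b):
    # RBF kernel (illustrative choice of lengthscale)
    return math.exp(-0.5 * (a - b) ** 2)

def gp_posterior(x, x1, x2, y1, y2, noise=1e-8):
    # K + sigma_noise^2 I for the two training points
    a, b = k(x1, x1) + noise, k(x1, x2)
    c, d = k(x2, x1), k(x2, x2) + noise
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # explicit 2x2 inverse
    kv = [k(x, x1), k(x, x2)]                         # kernel vector at x
    # w = k^T [K + sigma_noise^2 I]^{-1}
    w = [inv[0][0] * kv[0] + inv[0][1] * kv[1],
         inv[1][0] * kv[0] + inv[1][1] * kv[1]]
    mu = w[0] * y1 + w[1] * y2                        # mu(x)   = w . y
    var = k(x, x) - (w[0] * kv[0] + w[1] * kv[1])     # sigma^2 = k(x,x) - w . k
    return mu, var

# At a training point the posterior mean reproduces the observation and the
# variance collapses; far from the data the variance grows back to k(x, x) = 1.
mu0, var0 = gp_posterior(0.0, 0.0, 3.0, 1.5, -0.5)
mu_far, var_far = gp_posterior(10.0, 0.0, 3.0, 1.5, -0.5)
print(round(mu0, 3), round(var0, 3), round(var_far, 3))
```

The collapsing-then-growing variance is exactly the uncertainty band the step-by-step figures earlier in the deck illustrate.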
  21. Gaussian Process and EI

    Because the GP makes p(y|x) Gaussian with mean μ_t(x) and variance σ_t²(x), the EI integral has a closed form: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* − y, 0) p(y|x) dy = ∫_{-∞}^{∞} max(y* − y, 0) N(y; μ_t(x), σ_t²(x)) dy = (y* − μ_t(x)) Φ(Z) + σ_t(x) φ(Z), with Z = (y* − μ_t(x)) / σ_t(x) (Φ: Cumulative Distribution Function of the standard normal, φ: its density). The derivation is omitted here; see the references on the Web.
  22. Illustration

    • (figure) The surrogate model of f(x) and its uncertainty over iterations, from https://arxiv.org/pdf/1012.2599.pdf
  23. Illustration (cont.)

    • (figure) How maximizing EI_{y*}(x) selects the next evaluation point • Figure from https://arxiv.org/pdf/1012.2599.pdf
  24. Bayesian Optimization: summary

    • Instead of optimizing the expensive black-box f(x) directly, fit a surrogate model to the observed evaluations • Choose the next point by maximizing an acquisition function over the surrogate, evaluate f(x) there, and repeat • Acquisition function used here: Expected Improvement • Surrogate model for f(x) used here: Gaussian Process
  25. Bayesian Optimization (library)

    • Example with the BayesianOptimization package • The search is driven by BayesianOptimization.maximize: - init_points ... number of random initial evaluations - n_iter ... number of Bayesian-optimization iterations - acq ... which acquisition function to use - kernel ... which GP kernel to use
  26. Inside the library

    • BayesianOptimization's maximize delegates to a UtilityFunction internally • With acq=‘ei’, UtilityFunction._ei is what computes the Expected Improvement
  27. Tree-structured Parzen Estimator (TPE)

    The method used by hyperopt and Optuna. Where GP-based Bayesian optimization models p(y|x), TPE instead models p(x|y) and p(y). TPE defines p(x|y) by splitting the observations at a threshold y*: p(x|y) = ℓ(x) if y < y*, g(x) if y ≥ y*
  28. Tree-structured Parzen Estimator and EI

    Under the TPE model, Expected Improvement reduces to a ratio of the two densities ℓ(x) and g(x): EI_{y*}(x) = ∫_{-∞}^{y*} (y* − y) p(y|x) dy ∝ (γ + g(x)/ℓ(x) (1 − γ))⁻¹
  29. TPE derivation

    EI_{y*}(x) = ∫_{-∞}^{y*} (y* − y) p(y|x) dy
               = ∫_{-∞}^{y*} (y* − y) p(x|y) p(y) / p(x) dy
               = (γ y* ℓ(x) − ℓ(x) ∫_{-∞}^{y*} p(y) dy) / (γ ℓ(x) + (1 − γ) g(x))
               ∝ (γ + g(x)/ℓ(x) (1 − γ))⁻¹
    using • p(x|y) = ℓ(x) if y < y*, g(x) if y ≥ y* • γ = p(y < y*) • p(x) = ∫_ℝ p(x|y) p(y) dy = γ ℓ(x) + (1 − γ) g(x) So maximizing EI means maximizing the ratio ℓ(x)/g(x): pick an x that is likely under the good observations and unlikely under the bad ones.
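The conclusion of the derivation, "maximize ℓ(x)/g(x)", can be sketched directly. The trial data, quantile γ, and kernel bandwidth below are illustrative; ℓ and g are plain Parzen (Gaussian kernel density) estimators:

```python
# Minimal TPE sketch: split past trials at the gamma-quantile of y, fit
# Parzen estimators l(x) and g(x), and pick the candidate maximizing l/g.
import math

def parzen(x, points, bw=0.5):
    # Gaussian kernel density estimate built from the given points
    return sum(math.exp(-0.5 * ((x - p) / bw) ** 2) /
               (bw * math.sqrt(2.0 * math.pi)) for p in points) / len(points)

# past trials: hyperparameter x and observed loss y = (x - 2)^2 (illustrative)
trials = [(0.0, 4.0), (1.0, 1.0), (2.5, 0.25), (4.0, 4.0), (5.0, 9.0), (1.8, 0.04)]
gamma = 0.33
trials_sorted = sorted(trials, key=lambda t: t[1])
n_good = max(1, int(gamma * len(trials)))
good = [x for x, _ in trials_sorted[:n_good]]   # l(x) is fitted to these
bad  = [x for x, _ in trials_sorted[n_good:]]   # g(x) is fitted to these

# Maximizing l(x)/g(x) maximizes EI under the TPE derivation
candidates = [i * 0.1 for i in range(51)]
x_next = max(candidates, key=lambda x: parzen(x, good) / parzen(x, bad))
print(round(x_next, 1))  # lands near the good trials around x = 2
```

Note how no surrogate of y itself is ever fitted: only the two densities over x matter, which is what makes TPE cheap and easy to extend to conditional ("tree-structured") search spaces.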
  30. hyperopt and Optuna

    Optuna can be seen as a refined successor of hyperopt (both optimize with TPE). Its distinguishing features: • Define-by-Run style API • Pruning of unpromising trials ← trials whose learning curves look hopeless are stopped early, saving compute for promising ones • Parallel distributed optimization • Dashboard for monitoring (source: https://research.preferred.jp/2018/12/optuna-release/)
  31. Demo

    (XGBoost + hyperopt) • Define the objective function and the search space • Pass them to fmin and the optimization runs; that is all it takes
  32. Demo (cont.)

    (XGBoost + Optuna) • The usage is similar to hyperopt • With the Define-by-Run API, the parameters are suggested inside the objective function itself
  33. References

    • Algorithms for Hyper-Parameter Optimization https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
    • A Conceptual Explanation of Bayesian Hyperparameter Optimization for Machine Learning https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f
    • A Comparative Study of Black-box Optimization Algorithms for Tuning of Hyper-parameters in Deep Neural Networks http://www.diva-portal.org/smash/get/diva2:1223709/FULLTEXT01.pdf
    • Hyperopt the Xgboost model - Kaggle https://www.kaggle.com/yassinealouini/hyperopt-the-xgboost-model
    • Hyperparameter optimization for Neural Networks - NeuPy http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html#bayesian-optimization
    • Hyperparameter tuning in Cloud Machine Learning Engine using Bayesian Optimization - Google Cloud Blog https://cloud.google.com/blog/products/gcp/hyperparameter-tuning-cloud-machine-learning-engine-using-bayesian-optimization
    • An Introductory Example of Bayesian Optimization in Python with hyperopt https://towardsdatascience.com/an-introductory-example-of-bayesian-optimization-in-python-with-hyperopt-aae40fff4ff0
    • A Tutorial on Bayesian Optimization https://arxiv.org/pdf/1807.02811.pdf
    • A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning https://arxiv.org/pdf/1012.2599.pdf
    • A Tutorial on Gaussian Processes (or why I don't use SVMs) http://mlss2011.comp.nus.edu.sg/uploads/Site/lect1gp.pdf
    • Optuna (TPE) - Qiita https://qiita.com/nabenabe0928/items/708d221dbccebf31f01c
    • XGBoost hyperparameter tuning with Optuna http://www.algo-fx-blog.com/xgboost-optuna-hyperparameter-tuning/
    • Introduction to Bayesian Optimization (slides) https://www.slideshare.net/hoxo_m/ss-77421091