Parameter Tuning with Bayesian Optimization


Waku Michishita

April 13, 2019

Transcript

  1. Parameter Tuning with Bayesian Optimization

    Waku Michishita, April 13, 2019
  2. Agenda

    1. Hyperparameters 2. Existing tuning methods 3. Bayesian Optimization 4. Tree-structured Parzen Estimator (TPE) 5. hyperopt / Optuna
  3. What are hyperparameters?

    • Parameters of models such as XGBoost and Gradient Boosting that are not learned from the data and must be set by the user • Examples for gradient boosting: n_estimator, max_depth, min_sample_split, max_leaf_nodes, etc. • Model performance depends strongly on how these hyperparameters are set, so they need to be tuned
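As a concrete illustration, hyperparameters like the ones named above are usually gathered into a search space that a tuner iterates over. The parameter names follow the slide; the ranges below are hypothetical, not recommendations:

```python
# Hypothetical search space over the gradient-boosting hyperparameters
# listed above (illustrative ranges only).
search_space = {
    "n_estimator":      range(50, 501, 50),   # number of boosting rounds
    "max_depth":        range(2, 11),         # depth of each tree
    "min_sample_split": range(2, 21),         # min samples to split a node
    "max_leaf_nodes":   range(8, 129, 8),     # cap on leaves per tree
}

# The tuning problem: pick one value per hyperparameter so the trained
# model's evaluation metric is as good as possible.
n_combinations = 1
for values in search_space.values():
    n_combinations *= len(values)
print(n_combinations)  # 27360 combinations from just four parameters
```

Even four parameters span tens of thousands of combinations, which is why the search strategy matters.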
  4. Classical tuning approaches

    • Common search strategies: • Grid Search • Random Search • (Manual Search) • Each evaluates many candidate settings and keeps the best one
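The two automated strategies above can be sketched in a few lines. The `metric` function here is a hypothetical stand-in for a real validation score, and the parameter ranges are illustrative:

```python
# Minimal sketch contrasting Grid Search and Random Search on a toy
# objective; `metric` is a hypothetical stand-in for a validation loss.
import random

def metric(max_depth, learning_rate):
    # pretend validation loss, best around max_depth = 6, learning_rate = 0.1
    return (max_depth - 6) ** 2 + (learning_rate - 0.1) ** 2 * 100

# Grid Search: every combination of a fixed set of values (12 trials)
grid = [(d, lr) for d in [2, 4, 6, 8] for lr in [0.01, 0.1, 0.3]]
best_grid = min(grid, key=lambda p: metric(*p))

# Random Search: the same budget of 12 trials, sampled from the ranges
random.seed(0)
trials = [(random.randint(2, 8), random.uniform(0.01, 0.3)) for _ in range(12)]
best_random = min(trials, key=lambda p: metric(*p))

print(best_grid)  # (6, 0.1): this grid happens to contain the optimum
```

Grid search only finds values that are on the grid, while random search covers the ranges more densely per parameter; neither uses the results of past trials, which is the gap Bayesian optimization fills.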
  5. Hyperparameter optimization libraries

    GPyOpt hyperopt optuna TPOT deap auto_ml BTB Chocolate Cornell-MOE devol H2O HORD HPOlib HpBandSter hypergrad mlrMBO pbt pycma rbfopt ROBO spearmint test-tube TUNE (list: https://medium.com/@mikkokotila/a-comprehensive-list-of-hyperparameter-optimization-tuning-solutions-88e067f19d9)
  6. Topics covered today

    • Methods: Bayesian Optimization (Sequential Model-Based Optimization / SMBO), Tree-structured Parzen Estimator (TPE) • Libraries: hyperopt, Optuna
  7. Bayesian Optimization

    The method behind hyperparameter tuners such as hyperopt and Optuna. Bayesian Optimization in outline: • the evaluation metric is treated as a black-box function of the hyperparameters • a probabilistic surrogate model of that function is fitted to the points evaluated so far • an acquisition function computed from the surrogate decides which point to evaluate next • a Gaussian Process is typically used as the surrogate model
  8. Problem setting

    Hyperparameter tuning as an optimization problem: find x* = argmin_{x ∈ X} f(x) • x ... the hyperparameters • f(x) ... the evaluation metric to be minimized (ex. RMSE)
  9. Evaluating f(x)

    Training the model and measuring the metric is itself one evaluation of f(x), so every evaluation of f(x) is expensive.
  10. Step 1

    (figure) Evaluate f(x) at a few initial points.
  11. Step 2

    (figure) Fit a surrogate model to the observed values of f(x); the model predicts f(x) with uncertainty.
  12. Step 3

    (figure) Compute the acquisition function from the surrogate model of f(x).
  13. Step 4

    (figure) Evaluate f(x) at the point that maximizes the acquisition function and add the new observation.
  14. Step 5

    (figure) Refit the surrogate model of f(x) with the enlarged set of observations.
  15. Repeat

    (figure) As steps 2 to 5 repeat, the surrogate approaches f(x) and the search concentrates around the minimum of f(x).
  16. Why this works

    • f(x) itself is never optimized directly ← f(x) is a black box • The cheap surrogate and acquisition function decide where to evaluate next ← balancing exploration of uncertain regions against exploitation of promising ones • The aim is to keep the number of evaluations of f(x) small ← each evaluation of f(x) is expensive • → a good x can be found with few evaluations
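The loop in steps 1 to 5 can be sketched end to end. This is a minimal sketch, not the deck's demo: the 1-D objective f(x) = (x - 2)^2 is a toy stand-in for a validation metric, the surrogate is a zero-mean Gaussian Process with an RBF kernel, and the acquisition is Expected Improvement, all written from scratch with the standard library:

```python
import math

# Toy stand-in for the expensive metric f(x); true minimum at x = 2.
def f(x):
    return (x - 2.0) ** 2

# RBF kernel for the Gaussian-Process surrogate.
def kernel(a, b):
    return math.exp(-0.5 * (a - b) ** 2)

# Gauss-Jordan elimination: solves A v = b for small systems.
def solve(A, b):
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                t = M[r][c] / M[c][c]
                M[r] = [mr - t * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# GP posterior mean/variance at x given observations (X, y); zero prior mean.
def gp_posterior(x, X, y, noise=1e-6):
    K = [[kernel(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    kv = [kernel(x, a) for a in X]
    alpha = solve(K, y)                     # [K + noise*I]^{-1} y
    mu = sum(ki * ai for ki, ai in zip(kv, alpha))
    v = solve(K, kv)                        # [K + noise*I]^{-1} k
    var = kernel(x, x) - sum(ki * vi for ki, vi in zip(kv, v))
    return mu, max(var, 1e-12)

# Expected Improvement for minimization: (best - mu) Phi(z) + sigma phi(z).
def expected_improvement(mu, var, best):
    sigma = math.sqrt(var)
    z = (best - mu) / sigma
    Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * Phi + sigma * phi

# Step 1: a couple of initial evaluations; then repeat steps 2-5.
X, y = [0.0, 5.0], [f(0.0), f(5.0)]
grid = [i * 0.1 for i in range(51)]         # candidate points in [0, 5]
for _ in range(10):
    best = min(y)
    x_next = max((x for x in grid if x not in X),
                 key=lambda x: expected_improvement(*gp_posterior(x, X, y), best))
    X.append(x_next)                        # evaluate f at the EI maximizer...
    y.append(f(x_next))                     # ...and add the new observation
x_best = X[y.index(min(y))]
print(round(x_best, 1))                     # close to the true minimum x = 2
```

With only twelve evaluations of f the search settles near x = 2, which is the point of the method: few expensive evaluations, guided by the surrogate.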
  17. Acquisition function

    • The next point to evaluate is chosen by maximizing an acquisition function computed from the surrogate model of f(x) • It trades off exploring regions where the model is uncertain against exploiting regions where f(x) is predicted to be small • Typical choices: PI (Probability of Improvement), EI (Expected Improvement) • This talk uses EI: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* - y, 0) p(y|x) dy, where y* is the best value of f(x) observed so far
  18. EI

    (Expected Improvement) • Definition: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* - y, 0) p(y|x) dy • p(y|x) is the model's predictive distribution of the metric at x, and y* is the best value observed so far • "Improvement" is how much better than y* the new point turns out to be; EI is its expectation • Computing it requires the predictive distribution p(y|x), which is what the surrogate model of y = f(x) provides
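The EI integral can be checked numerically. This sketch assumes p(y|x) is Gaussian with illustrative values (mu = 1.0, sigma = 0.5, best-so-far y* = 1.2) and compares a Monte Carlo estimate of the integral against the standard closed-form expression for the Gaussian case:

```python
# Numeric sanity check of the EI definition, assuming a Gaussian p(y|x)
# with illustrative parameters.
import math
import random

mu, sigma, y_star = 1.0, 0.5, 1.2

# Monte Carlo estimate of EI = E[max(y* - y, 0)] under p(y|x)
random.seed(0)
samples = [random.gauss(mu, sigma) for _ in range(200_000)]
ei_mc = sum(max(y_star - y, 0.0) for y in samples) / len(samples)

# Closed form for a Gaussian p(y|x):
# EI = (y* - mu) * Phi(z) + sigma * phi(z),  z = (y* - mu) / sigma
z = (y_star - mu) / sigma
Phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
ei_closed = (y_star - mu) * Phi + sigma * phi

print(round(ei_mc, 3), round(ei_closed, 3))  # the two values agree closely
```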
  19. Gaussian Process

    Modeling p(y|x) with a Gaussian Process: given the observations D_t = {(x_i, y_i)}_{i=1..t}, the prediction at a new point x_{t+1} is Gaussian: y_{t+1} | D_t ~ N(μ_t(x_{t+1}), σ_t²(x_{t+1}) + σ_noise²) • μ_t(x) ... predictive mean at x • σ_t²(x) ... predictive variance at x • σ_noise² ... observation noise
  20. Predictive mean and variance

    The GP posterior gives μ_t(x) and σ_t²(x) in closed form from the observations: • μ_t(x) = kᵀ [K + σ_noise² I]⁻¹ y_{1:t} • σ_t²(x) = k(x, x) − kᵀ [K + σ_noise² I]⁻¹ k where K is the kernel matrix over the observed points and k is the vector of kernel values between x and each observed point. Example kernel: • k(x_i, x_j) = exp(−‖x_i − x_j‖²) → points close to each other are strongly correlated
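The two formulas above can be exercised directly. A minimal sketch assuming two observed points, an RBF kernel k(a, b) = exp(-0.5 (a - b)^2), and a tiny noise term; the 2x2 matrix [K + σ_noise² I] is inverted explicitly (the data values are illustrative):

```python
import math

def k(a, b):
    # RBF kernel (illustrative choice of lengthscale)
    return math.exp(-0.5 * (a - b) ** 2)

def gp_posterior(x, x1, x2, y1, y2, noise=1e-8):
    # K + sigma_noise^2 I for the two training points
    a, b = k(x1, x1) + noise, k(x1, x2)
    c, d = k(x2, x1), k(x2, x2) + noise
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # explicit 2x2 inverse
    kv = [k(x, x1), k(x, x2)]                         # kernel vector at x
    # w = k^T [K + sigma_noise^2 I]^{-1}
    w = [inv[0][0] * kv[0] + inv[0][1] * kv[1],
         inv[1][0] * kv[0] + inv[1][1] * kv[1]]
    mu = w[0] * y1 + w[1] * y2                        # mu(x)   = w . y
    var = k(x, x) - (w[0] * kv[0] + w[1] * kv[1])     # sigma^2 = k(x,x) - w . k
    return mu, var

# At a training point the posterior mean reproduces the observation and the
# variance collapses; far from the data the variance grows back to k(x, x) = 1.
mu0, var0 = gp_posterior(0.0, 0.0, 3.0, 1.5, -0.5)
mu_far, var_far = gp_posterior(10.0, 0.0, 3.0, 1.5, -0.5)
print(round(mu0, 3), round(var0, 3), round(var_far, 3))
```

The collapsing-then-growing variance is exactly the uncertainty band the step-by-step figures earlier in the deck illustrate.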
  21. Gaussian Process and EI

    Because the GP makes p(y|x) Gaussian with mean μ_t(x) and variance σ_t²(x), the EI integral has a closed form: EI_{y*}(x) := ∫_{-∞}^{∞} max(y* − y, 0) p(y|x) dy = ∫_{-∞}^{∞} max(y* − y, 0) N(y; μ_t(x), σ_t²(x)) dy = (y* − μ_t(x)) Φ(Z) + σ_t(x) φ(Z), with Z = (y* − μ_t(x)) / σ_t(x) (Φ: Cumulative Distribution Function of the standard normal, φ: its density). The derivation is omitted here; see the references on the Web.
  22. Illustration

    • (figure) The surrogate model of f(x) and its uncertainty over iterations, from https://arxiv.org/pdf/1012.2599.pdf
  23. Illustration (cont.)

    • (figure) How maximizing EI_{y*}(x) selects the next evaluation point • Figure from https://arxiv.org/pdf/1012.2599.pdf
  24. Bayesian Optimization: summary

    • Instead of optimizing the expensive black-box f(x) directly, fit a surrogate model to the observed evaluations • Choose the next point by maximizing an acquisition function over the surrogate, evaluate f(x) there, and repeat • Acquisition function used here: Expected Improvement • Surrogate model for f(x) used here: Gaussian Process
  25. Bayesian Optimization (library)

    • Example with the BayesianOptimization package • The search is driven by BayesianOptimization.maximize: - init_points ... number of random initial evaluations - n_iter ... number of Bayesian-optimization iterations - acq ... which acquisition function to use - kernel ... which GP kernel to use
  26. Inside the library

    • BayesianOptimization's maximize delegates to a UtilityFunction internally • With acq=‘ei’, UtilityFunction._ei is what computes the Expected Improvement
  27. Tree-structured Parzen Estimator (TPE)

    The method used by hyperopt and Optuna. Where GP-based Bayesian optimization models p(y|x), TPE instead models p(x|y) and p(y). TPE defines p(x|y) by splitting the observations at a threshold y*: p(x|y) = ℓ(x) if y < y*, g(x) if y ≥ y*
  28. Tree-structured Parzen Estimator and EI

    Under the TPE model, Expected Improvement reduces to a ratio of the two densities ℓ(x) and g(x): EI_{y*}(x) = ∫_{-∞}^{y*} (y* − y) p(y|x) dy ∝ (γ + g(x)/ℓ(x) (1 − γ))⁻¹
  29. TPE derivation

    EI_{y*}(x) = ∫_{-∞}^{y*} (y* − y) p(y|x) dy
               = ∫_{-∞}^{y*} (y* − y) p(x|y) p(y) / p(x) dy
               = (γ y* ℓ(x) − ℓ(x) ∫_{-∞}^{y*} p(y) dy) / (γ ℓ(x) + (1 − γ) g(x))
               ∝ (γ + g(x)/ℓ(x) (1 − γ))⁻¹
    using • p(x|y) = ℓ(x) if y < y*, g(x) if y ≥ y* • γ = p(y < y*) • p(x) = ∫_ℝ p(x|y) p(y) dy = γ ℓ(x) + (1 − γ) g(x) So maximizing EI means maximizing the ratio ℓ(x)/g(x): pick an x that is likely under the good observations and unlikely under the bad ones.
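The conclusion of the derivation, "maximize ℓ(x)/g(x)", can be sketched directly. The trial data, quantile γ, and kernel bandwidth below are illustrative; ℓ and g are plain Parzen (Gaussian kernel density) estimators:

```python
# Minimal TPE sketch: split past trials at the gamma-quantile of y, fit
# Parzen estimators l(x) and g(x), and pick the candidate maximizing l/g.
import math

def parzen(x, points, bw=0.5):
    # Gaussian kernel density estimate built from the given points
    return sum(math.exp(-0.5 * ((x - p) / bw) ** 2) /
               (bw * math.sqrt(2.0 * math.pi)) for p in points) / len(points)

# past trials: hyperparameter x and observed loss y = (x - 2)^2 (illustrative)
trials = [(0.0, 4.0), (1.0, 1.0), (2.5, 0.25), (4.0, 4.0), (5.0, 9.0), (1.8, 0.04)]
gamma = 0.33
trials_sorted = sorted(trials, key=lambda t: t[1])
n_good = max(1, int(gamma * len(trials)))
good = [x for x, _ in trials_sorted[:n_good]]   # l(x) is fitted to these
bad  = [x for x, _ in trials_sorted[n_good:]]   # g(x) is fitted to these

# Maximizing l(x)/g(x) maximizes EI under the TPE derivation
candidates = [i * 0.1 for i in range(51)]
x_next = max(candidates, key=lambda x: parzen(x, good) / parzen(x, bad))
print(round(x_next, 1))  # lands near the good trials around x = 2
```

Note how no surrogate of y itself is ever fitted: only the two densities over x matter, which is what makes TPE cheap and easy to extend to conditional ("tree-structured") search spaces.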
  30. hyperopt and Optuna

    Optuna can be seen as a refined successor of hyperopt (both optimize with TPE). Its distinguishing features: • Define-by-Run style API • Pruning of unpromising trials ← trials whose learning curves look hopeless are stopped early, saving compute for promising ones • Parallel distributed optimization • Dashboard for monitoring (source: https://research.preferred.jp/2018/12/optuna-release/)
  31. Demo

    (XGBoost + hyperopt) • Define the objective function and the search space • Pass them to fmin and the optimization runs; that is all it takes
  32. Demo (cont.)

    (XGBoost + Optuna) • The usage is similar to hyperopt • With the Define-by-Run API, the parameters are suggested inside the objective function itself
  33. References

    • Algorithms for Hyper-Parameter Optimization https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf
    • A Conceptual Explanation of Bayesian Hyperparameter Optimization for Machine Learning https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f
    • A Comparative Study of Black-box Optimization Algorithms for Tuning of Hyper-parameters in Deep Neural Networks http://www.diva-portal.org/smash/get/diva2:1223709/FULLTEXT01.pdf
    • Hyperopt the Xgboost model - Kaggle https://www.kaggle.com/yassinealouini/hyperopt-the-xgboost-model
    • Hyperparameter optimization for Neural Networks - NeuPy http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html#bayesian-optimization
    • Hyperparameter tuning in Cloud Machine Learning Engine using Bayesian Optimization - Google Cloud Blog https://cloud.google.com/blog/products/gcp/hyperparameter-tuning-cloud-machine-learning-engine-using-bayesian-optimization
    • An Introductory Example of Bayesian Optimization in Python with hyperopt https://towardsdatascience.com/an-introductory-example-of-bayesian-optimization-in-python-with-hyperopt-aae40fff4ff0
    • A Tutorial on Bayesian Optimization https://arxiv.org/pdf/1807.02811.pdf
    • A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning https://arxiv.org/pdf/1012.2599.pdf
    • A Tutorial on Gaussian Processes (or why I don't use SVMs) http://mlss2011.comp.nus.edu.sg/uploads/Site/lect1gp.pdf
    • Optuna (TPE) - Qiita https://qiita.com/nabenabe0928/items/708d221dbccebf31f01c
    • XGBoost hyperparameter tuning with Optuna http://www.algo-fx-blog.com/xgboost-optuna-hyperparameter-tuning/
    • Introduction to Bayesian Optimization (slides) https://www.slideshare.net/hoxo_m/ss-77421091