Slide 52
Reference
• Kato, M., and Teshima, T. (2022), “Non-negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation,” in International Conference on Machine Learning.
• Kato, M., Imaizumi, M., McAlinn, K., Yasui, S., and Kakehi, H. (2022), “Learning Causal Relationships from Conditional Moment Restrictions by Importance Weighting,” in International Conference on Learning Representations.
• Kato, M., Imaizumi, M., and Minami, K. (2022), “Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics.”
• Kanamori, T., Hido, S., and Sugiyama, M. (2009), “A least-squares approach to direct importance estimation,” Journal of Machine Learning Research, 10(Jul.), 1391–1445.
• Kiryo, R., Niu, G., du Plessis, M. C., and Sugiyama, M. (2017), “Positive-Unlabeled Learning with Non-Negative Risk Estimator,” in Conference on Neural Information Processing Systems.
• Imbens, G. W. and Lancaster, T. (1996), “Efficient estimation and stratified sampling,” Journal of Econometrics, 74, 289–318.
• Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018), “Double/debiased machine learning for treatment and structural parameters,” Econometrics Journal, 21, C1–C68.
• Good, I. J. and Gaskins, R. A. (1971), “Nonparametric Roughness Penalties for Probability Densities,” Biometrika, 58, 255–277.
• Sugiyama, M., Nakajima, S., Kashima, H., von Bünau, P., and Kawanabe, M. (2007), “Direct importance estimation with model selection and its application to covariate shift adaptation,” in Proceedings of the 20th International Conference on Neural Information Processing Systems (NIPS'07), Curran Associates Inc., Red Hook, NY, USA, 1433–1440.
• Sugiyama, M., Suzuki, T., and Kanamori, T. (2011), “Density Ratio Matching under the Bregman Divergence: A Unified Framework of Density Ratio Estimation,” Annals of the Institute of Statistical Mathematics, 64.
• Sugiyama, M., Suzuki, T., and Kanamori, T. (2012), Density Ratio Estimation in Machine Learning, New York, NY, USA: Cambridge University Press, 1st ed.
• Sugiyama, M. (2016), “Introduction to Statistical Machine Learning.”
• Silverman, B. W. (1982), “On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method,” The Annals of Statistics, 10, 795–810.
• Suzuki, T., Sugiyama, M., Sese, J., and Kanamori, T. (2008), “Approximating mutual information by maximum likelihood density ratio estimation,” in Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, volume 4 of Proceedings of Machine Learning Research, pp. 5–20, PMLR.
• Uehara, M., Sato, I., Suzuki, M., Nakayama, K., and Matsuo, Y. (2016), “Generative Adversarial Nets from a Density Ratio Estimation Perspective.”
• Tran, D., Ranganath, R., and Blei, D. M. (2017), “Hierarchical Implicit Models and Likelihood-Free Variational Inference,” in International Conference on Neural Information Processing Systems, Red Hook, NY, USA, pp. 5529–5539.
• Nguyen, X., Wainwright, M. J., and Jordan, M. (2008), “Estimating divergence functionals and the likelihood ratio by penalized convex risk minimization,” in Conference on Neural Information Processing Systems, vol. 20.
• Newey, W. K. and Powell, J. L. (2003), “Instrumental variable estimation of nonparametric models,” Econometrica, 71(5), 1565–1578.
• Hido, S., Tsuboi, Y., Kashima, H., Sugiyama, M., and Kanamori, T. (2011), “Statistical outlier detection using direct density ratio estimation,” Knowledge and Information Systems, 26, 309–336.
• Lai, T. and Robbins, H. (1985), “Asymptotically efficient adaptive allocation rules,” Advances in Applied Mathematics.
• Kaufmann, E., Cappé, O., and Garivier, A. (2016), “On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models,” Journal of Machine Learning Research, 17, 1–42.
• Fan, X., Grama, I., and Liu, Q. (2013), “Cramér large deviation expansions for martingales under Bernstein’s condition,” Stochastic Processes and their Applications, 123, 3919–3942.
• Fan, X., Grama, I., and Liu, Q. (2014), “A generalization of Cramér large deviations for martingales,” Comptes Rendus Mathematique, 352, 853–858.
• Shimodaira, H. (2000), “Improving predictive inference under covariate shift by weighting the log-likelihood function,” Journal of Statistical Planning and Inference, 90, 227–244.