
Introduction to GPLVM


Presentation given at a machine learning seminar

Kazu Ghalamkari

April 23, 2020
Transcript

  1. Gaussian Process Latent Variable Model
    [1] Neil Lawrence. Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data.
    [2] Neil Lawrence. Probabilistic Non-linear Principal Component Analysis with Gaussian Process Latent Variable Models.
  2. Contents
    Theory
    ・What Gaussian process regression is
    ・Gaussian process latent variable model
    Examples
    ・An easy experiment on the oil flow dataset, with source code
    ・GPLVM as a generative model
    ・Phase transition related to the hyperparameters of the GP
    Related models
    ・Infinite Warped Mixture Model, iWMM
    ・Gaussian Process Dynamical Model, GPDM
  3. What Gaussian process regression is
    Dataset: $\mathcal{D} = \{(x_n, y_n)\}_{n=1}^{N}$
    Linear regression (basis functions $\phi(\cdot)$ have to be given):
    $y = \sum_{m=1}^{M} w_m \phi_m(x)$, estimated by minimizing the MSE with the design matrix $\Phi$, $\Phi_{nm} = \phi_m(x_n)$:
    $\hat{w} = (\Phi^\top \Phi)^{-1} \Phi^\top y$
    Gaussian process regression (a kernel function has to be given):
    We introduce the prior distribution $w \sim \mathcal{N}(0, \lambda^2 I)$. Then $y = \Phi w$ follows a Gaussian distribution:
    $y \sim \mathcal{N}(0, \lambda^2 \Phi \Phi^\top) \equiv \mathcal{N}(0, K)$, where $K_{nn'} = k(x_n, x_{n'}) = \lambda^2 \phi(x_n)^\top \phi(x_{n'})$
    The joint Gaussian with a test point $x_*$,
    $\begin{pmatrix} y \\ y_* \end{pmatrix} \sim \mathcal{N}\!\left(0, \begin{pmatrix} K & k_* \\ k_*^\top & k_{**} \end{pmatrix}\right)$,
    gives the predictive distribution
    $p(y_* \mid x_*, \mathcal{D}) = \mathcal{N}\!\left(k_*^\top K^{-1} y,\; k_{**} - k_*^\top K^{-1} k_*\right)$,
    with $k_* = (k(x_*, x_1), \dots, k(x_*, x_N))^\top$ and $k_{**} = k(x_*, x_*)$.
    Example (RBF kernel): $k(x, x') = \theta_1 \exp\!\left(-\frac{1}{\theta_2}\|x - x'\|^2\right)$
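    A minimal NumPy sketch of these predictive equations (the hyperparameter values theta1, theta2 and the noise variance sigma2 are assumptions chosen for illustration, not values from the slides):

      import numpy as np

      def rbf(A, B, theta1=1.0, theta2=1.0):
          """RBF kernel k(x, x') = theta1 * exp(-||x - x'||^2 / theta2)."""
          d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
          return theta1 * np.exp(-d2 / theta2)

      def gp_predict(X, y, Xstar, sigma2=1e-2):
          """Predictive mean and variance of GP regression at test inputs Xstar."""
          K = rbf(X, X) + sigma2 * np.eye(len(X))     # Gram matrix (noise added for stability)
          ks = rbf(X, Xstar)                          # column j holds k(x_n, x_*j)
          kss = rbf(Xstar, Xstar)                     # k_** for every test pair
          mean = ks.T @ np.linalg.solve(K, y)         # k_*^T K^{-1} y
          cov = kss - ks.T @ np.linalg.solve(K, ks)   # k_** - k_*^T K^{-1} k_*
          return mean, np.diag(cov)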
  4. Introduction of GPLVM
    Dataset: $\mathcal{D} = \{y_n\}_{n=1}^{N}$; the observations $Y$ are given, the inputs $X$ are unknown.
    $Y = \begin{pmatrix} y_1^{(1)} & y_1^{(2)} & \cdots & y_1^{(D)} \\ y_2^{(1)} & y_2^{(2)} & \cdots & y_2^{(D)} \\ \vdots & \vdots & \ddots & \vdots \\ y_N^{(1)} & y_N^{(2)} & \cdots & y_N^{(D)} \end{pmatrix} \in \mathbb{R}^{N \times D}$
    Each column $y^{(d)} = (y_1^{(d)}, y_2^{(d)}, \dots, y_N^{(d)})^\top$ is generated from the common unknown inputs $X = (x_1, \dots, x_N)$ by Gaussian process regression:
    $y^{(d)} \sim \mathcal{N}(0, K + \sigma^2 I), \quad d = 1, \dots, D$
  5. Introduction of GPLVM
    $y^{(d)} \sim \mathcal{N}(0, K + \sigma^2 I)$. But how should we know $X$? Let us call $X$ the latent variable.
    $p(Y, X) = p(Y \mid X)\, p(X) = \prod_{d=1}^{D} p(y^{(d)} \mid X)\, p(X) = \prod_{d=1}^{D} \mathcal{N}(y^{(d)} \mid 0, K + \sigma^2 I) \prod_{n=1}^{N} p(x_n) = \prod_{d=1}^{D} \mathcal{N}(y^{(d)} \mid 0, K + \sigma^2 I) \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    Let us find the $X$ which maximizes it.
    (Why should such a low-dimensional $X$ exist? No rigorous reason; cf. the manifold hypothesis.)
  6. Introduction of GPLVM
    Writing $K' = K + \sigma^2 I$:
    $p(Y, X) = \prod_{d=1}^{D} \mathcal{N}(y^{(d)} \mid 0, K') \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    $= \prod_{d=1}^{D} \frac{1}{(2\pi)^{N/2} |K'|^{1/2}} \exp\!\left(-\frac{1}{2} y^{(d)\top} K'^{-1} y^{(d)}\right) \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    $= \frac{1}{(2\pi)^{ND/2} |K'|^{D/2}} \exp\!\left(-\frac{1}{2} \sum_{d=1}^{D} y^{(d)\top} K'^{-1} y^{(d)}\right) \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    $= \frac{1}{(2\pi)^{ND/2} |K'|^{D/2}} \exp\!\left(-\frac{1}{2} \operatorname{tr}\!\left(K'^{-1} Y Y^\top\right)\right) \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    The inner product of matrices $A$ and $B$ is $\operatorname{tr}(A^\top B)$, and it is larger the more similar $A$ and $B$ are. So when $p(Y, X)$ is large, $K'$ becomes similar to the correlation matrix of the observed data, $Y Y^\top$.
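    The objective on this slide fits in a few lines of NumPy. Below is a sketch under the assumption that kernel(X, X) returns the N×N Gram matrix K (the callable is hypothetical), with additive constants omitted:

      import numpy as np

      def gplvm_neg_log_joint(X, Y, kernel, sigma2):
          """-log p(Y, X) up to additive constants, following the derivation above."""
          N, D = Y.shape
          Kp = kernel(X, X) + sigma2 * np.eye(N)               # K' = K + sigma^2 I
          _, logdet = np.linalg.slogdet(Kp)                    # numerically stable log|K'|
          trace_term = np.trace(np.linalg.solve(Kp, Y @ Y.T))  # tr(K'^{-1} Y Y^T)
          prior_term = 0.5 * np.sum(X ** 2)                    # from the N(x_n | 0, I) prior
          return 0.5 * D * logdet + 0.5 * trace_term + prior_term

    Minimizing this over X (e.g. with scipy.optimize) gives the maximum a posteriori latent coordinates.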
  7. Example of GPLVM
    Let us experiment on the oil flow dataset: 2 γ-ray energies × 6 directions = 12-dimensional data.
    [Figure: scatter plots of the dataset; Trn = train, Vdn = validation, Tst = test; the three flow classes carry one-hot labels [1 0 0], [0 1 0], [0 0 1].]
  8. Example of GPLVM
    Easy to run with GPy (or GPyTorch, TensorFlow Probability, …)! Compare it to PCA.
    You can also know the confidence, since the predictive distribution is
    $p(y_* \mid x_*, \mathcal{D}) = \mathcal{N}\!\left(k_*^\top K'^{-1} Y,\; k_{**} - k_*^\top K'^{-1} k_*\right)$
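    A minimal GPy sketch along the lines of this slide; the data loading is a placeholder (the file name oil_flow.npy is hypothetical) and the kernel choice is an assumption:

      import numpy as np
      import GPy

      # Placeholder: load the (N, 12) oil-flow measurements beforehand.
      Y = np.load("oil_flow.npy")

      # GPLVM with a 2-dimensional latent space and an RBF kernel.
      kernel = GPy.kern.RBF(input_dim=2)
      model = GPy.models.GPLVM(Y, input_dim=2, kernel=kernel)
      model.optimize(messages=True)  # fit X and the hyperparameters by maximum likelihood

      X_latent = model.X             # compare this embedding with a 2-D PCA projection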
  9. Example of an application of GPLVM
    A 2-dimensional embedded latent space, learned with a scaled GPLVM.
    From "Style-Based Inverse Kinematics" (2004).
  10. What is the difference between GPLVM and VAE?
    Trained VAE: each point z in the feature space corresponds to a unique decoded sample, Decoded Data = Decoder(z), with prior $\mathcal{N}(0, I)$. You cannot know the confidence in the feature space, and the model might be overfitted.
    GPLVM: each point $x_*$ in the latent space corresponds to a Gaussian distribution,
    $p(y_* \mid x_*, \mathcal{D}) = \mathcal{N}\!\left(k_*^\top K'^{-1} Y,\; k_{**} - k_*^\top K'^{-1} k_*\right)$,
    so the decoded sample is not unique: we extract data by sampling this distribution. You can know the confidence in the latent space, and the model is less prone to overfitting.
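    Continuing the GPy sketch from slide 8, the confidence mentioned here can be read off directly: every latent point yields a full predictive Gaussian rather than a single decoded sample (the query point below is arbitrary):

      # Query an arbitrary point in the 2-D latent space of the fitted model.
      Xstar = np.array([[0.0, 0.0]])
      mean, var = model.predict(Xstar)  # predictive mean and variance in data space

      # 'var' is the confidence; sampling with it generates data, whereas a
      # VAE decoder would return only the single point Decoder(z).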
  11. Infinite Warped Mixture Model, iWMM
    GPLVM: $p(Y, X) = p(Y \mid X)\, p(X) = \prod_{d=1}^{D} \mathcal{N}(y^{(d)} \mid 0, K + \sigma^2 I) \prod_{n=1}^{N} \mathcal{N}(x_n \mid 0, I)$
    The iWMM replaces the standard Gaussian prior with a Gaussian mixture (GMM), so that the number of clusters in the latent space appears explicitly:
    $p(Y, X) = \prod_{d=1}^{D} \mathcal{N}(y^{(d)} \mid 0, K + \sigma^2 I) \prod_{n=1}^{N} \sum_{c=1}^{C} \lambda_c\, \mathcal{N}(x_n \mid \mu_c, R_c^{-1})$
    From "Warped Mixtures for Nonparametric Cluster Shapes" (2013), Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani.
    It is not easy to run; MATLAB code can be found on GitHub.
  12. Gaussian Process Dynamical Model, GPDM
    $Y = (y_{t=1}, y_{t=2}, \dots, y_{t=N})$: observed variables, evolving in time
    $X = (x_{t=1}, x_{t=2}, \dots, x_{t=N})$: latent variables, evolving in time
    GPLVM: $p(Y, X) = p(Y \mid X)\, p(X) = \prod_{d=1}^{D} p(y^{(d)} \mid X) \prod_{n=1}^{N} p(x_n)$
    GPDM: $p(Y, X) = \prod_{d=1}^{D} p(y^{(d)} \mid X) \prod_{t=2}^{N} p(x_t \mid x_{t-1})$
    [Figure: graphical models of GPLVM and GPDM; in the GPDM the latent chain $x_{t-1} \to x_t \to x_{t+1}$ drives the observations.]
    From "Gaussian Process Dynamical Models" (2008).
  13. Conclusion and Discussion
    We can use Gaussian processes in unsupervised learning as GPLVM
    - Dimensionality reduction
    - Clustering
    - Actually, GPLVM is a generalization of probabilistic PCA and kernel PCA
    - Actually, Bayesian GPLVM is popular (link)
    We can use GPLVM as a generative model
    - It is unlikely to be overfitted.
    - We can see the confidence in the latent space.
    There are some advanced models
    - Infinite Warped Mixture Model, iWMM
    - Gaussian Process Dynamical Model, GPDM
    - Discriminative Gaussian Process Latent Variable Model, discriminative GPLVM (link)
    - Supervised Latent Linear Gaussian Process Latent Variable Model, SLLGPLVM (link)
    Research topics
    - Computational complexity is $O(N^3)$: we have to compute the inverse matrix $(K + \sigma^2 I)^{-1}$
    - Analytical discussion of the generalization gap of Gaussian processes (link)
    - Data augmentation from GPLVM?