Slide 1

DATA-DRIVEN DIMENSION REDUCTION
Pranay Seshadri, Associate Professor in Aerospace Engineering, Georgia Institute of Technology

Slide 2

ACKNOWLEDGEMENTS: FOLKS WHO HAVE CONTRIBUTED

Much of the code was written by Henry Yuchi and Chun Yui Wong, with further edits by Ashley Scillitoe. We are grateful to Paul Constantine, for without him this deck wouldn't exist.

Slide 3

INTRODUCTION: A FEW WORDS

The trouble is that high-dimensional problems are unavoidable. Thankfully, equadratures has a few utilities that may be useful. Before delving into them, it will be helpful to discuss ideas like principal component analysis (PCA) and, more generally, the singular value decomposition (SVD).

Slide 4

OVERVIEW

1. PRINCIPAL COMPONENTS
2. A SIMPLE EXPERIMENT
3. PROJECTIONS & ZONOTOPES
4. ACTIVE SUBSPACES
5. RIDGE APPROXIMATIONS

Slide 5

[Figure. Image source: Rolls-Royce Flickr.]

Slide 6

PRINCIPAL COMPONENTS

Slide 7

PRINCIPAL COMPONENTS

The singular value decomposition (SVD) of a matrix: \( A = U \Sigma V^\top \), where \( \Sigma \) is a diagonal matrix of singular values, its size set by the number of rows and columns of \( A \).
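As an illustrative aside (not from the deck), a minimal numpy sketch of the SVD; the matrix A, its size, and the rank k below are hypothetical choices:

import numpy as np

# A hypothetical 100 x 5 data matrix: 100 observations, 5 dimensions.
A = np.random.rand(100, 5)

# Thin SVD, A = U @ np.diag(s) @ Vt, with the singular values s returned
# in descending order; centering the columns first makes this equivalent
# to principal component analysis.
U, s, Vt = np.linalg.svd(A - A.mean(axis=0), full_matrices=False)

# Rank-k approximation built from the k largest singular values.
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]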


Slide 12

PRINCIPAL COMPONENTS: A FEW WORDS

Consider another scenario where we have a matrix of measurements (say CMM data), with rows as observations and columns as dimensions. We wish to determine which modes of manufacturing variability are the greatest. We form the covariance matrix of the observed data and then compute its eigendecomposition: the eigenvectors give the modes, and the eigenvalues rank their importance.
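A minimal numpy sketch of this recipe, assuming a hypothetical measurements matrix X with rows as observations and columns as dimensions:

import numpy as np

# Hypothetical CMM-style data: 200 observations of a 10-dimensional measurement.
X = np.random.rand(200, 10)

# Covariance matrix of the observed data (rowvar=False: columns are dimensions).
C = np.cov(X, rowvar=False)

# Eigendecomposition of the symmetric covariance matrix; eigh returns
# eigenvalues in ascending order, so reverse to put dominant modes first.
eigvals, eigvecs = np.linalg.eigh(C)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# The leading eigenvectors are the dominant modes of variability;
# the eigenvalues quantify how much variance each mode captures.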

Slide 13

PRINCIPAL COMPONENTS: SUMMARISING THOUGHTS

Rather than apply principal components to a single image, we can apply it to numerous images to find dominant linear directions across all of them. Its utility is not restricted to images and videos; it extends to any matrix or vector database. More robust* and nonlinear** variants also exist.

However, we can't really use this if we are interested in input-output pairs. For instance, computer experiments are usually carried out via a uniform design of experiment. Naïvely applying PCA to the inputs does not facilitate output-driven dimension reduction.

*Candès, Li, Ma, Wright (2011). Robust principal component analysis? J. ACM.
**Schölkopf, Smola, Müller (1997). Kernel principal component analysis. Springer.

Slide 14

OVERVIEW

1. PRINCIPAL COMPONENTS
2. A SIMPLE EXPERIMENT
3. PROJECTIONS & ZONOTOPES
4. ACTIVE SUBSPACES
5. RIDGE APPROXIMATIONS

Slide 15

OUTPUT-BASED DIMENSION REDUCTION: A SIMPLE EXPERIMENT

Generate a 3 x 300 matrix of uniformly distributed random numbers in [0, 1]. For each set of numbers, evaluate the function f. [Figure: a sample of the random values, e.g., 0.01, 0.39, 0.61, 0.55, ...]

Slide 16

OUTPUT-BASED DIMENSION REDUCTION: A SIMPLE EXPERIMENT

Generate a 3 x 300 matrix of uniformly distributed random numbers in [0, 1]. For each set of numbers, evaluate the function f. Plot the outputs f(x) against the projected inputs u^T x for a chosen vector u.


Slide 19

OUTPUT-BASED DIMENSION REDUCTION: A SIMPLE EXPERIMENT

Generate a 3 x 300 matrix of uniformly distributed random numbers in [0, 1]. For each set of numbers, evaluate the function f. Plot the outputs f(x) against the projected inputs u^T x: with the right u, this yields a sufficient summary plot! We refer to u as our dimension-reducing subspace.
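Since the slide's function did not survive extraction, the sketch below uses a hypothetical ridge function f(x) = sin(pi u^T x) and a hypothetical direction u; only the 3 x 300 uniform sampling is from the deck:

import numpy as np
import matplotlib.pyplot as plt

# 300 samples of 3 uniformly distributed inputs in [0, 1].
X = np.random.rand(300, 3)

# Hypothetical dimension-reducing direction and ridge function,
# chosen purely for illustration.
u = np.array([0.8, -0.5, 0.33])
u = u / np.linalg.norm(u)
y = np.sin(np.pi * (X @ u))

# Sufficient summary plot: outputs against the projected inputs u^T x.
plt.scatter(X @ u, y, s=10)
plt.xlabel('u^T x')
plt.ylabel('f(x)')
plt.show()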

Slide 20

OUTPUT-BASED DIMENSION REDUCTION: A SIMPLE EXPERIMENT

Once this subspace is available, it is relatively easy to construct a polynomial approximation of the form \( f(\mathbf{x}) \approx g(\mathbf{u}^\top \mathbf{x}) \). [Figure: polynomial fit over the sufficient summary plot.]
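A sketch of such a fit with plain numpy rather than equadratures, reusing the hypothetical function and direction from the previous sketch; the polynomial degree is an arbitrary choice:

import numpy as np
import matplotlib.pyplot as plt

# Same hypothetical setup as before: 300 samples in [0, 1]^3, a unit
# direction u, and a placeholder function f(x) = sin(pi * u^T x).
X = np.random.rand(300, 3)
u = np.array([0.8, -0.5, 0.33])
u = u / np.linalg.norm(u)
t = X @ u
y = np.sin(np.pi * t)

# Fit a univariate polynomial g so that f(x) ~ g(u^T x); degree 5 is arbitrary.
coeffs = np.polyfit(t, y, deg=5)
t_grid = np.linspace(t.min(), t.max(), 100)

plt.scatter(t, y, s=10, label='samples')
plt.plot(t_grid, np.polyval(coeffs, t_grid), 'r-', label='polynomial fit')
plt.legend()
plt.show()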

Slide 21

OUTPUT-BASED DIMENSION REDUCTION: ANALOGOUS TO A SINGLE-LAYER NEURAL NETWORK

[Figure: network diagram with inputs, a hidden layer, and an output.]

Slide 22

OUTPUT-BASED DIMENSION REDUCTION: ANALOGOUS TO A SINGLE-LAYER NEURAL NETWORK

[Figure: network diagram with inputs, a hidden layer, and an output.]

NOTE: In a classical neural network, we have no a priori notion of how many neurons we need per hidden layer, nor of the relationship between them.

Slide 23

OVERVIEW

1. PRINCIPAL COMPONENTS
2. A SIMPLE EXPERIMENT
3. PROJECTIONS & ZONOTOPES
4. ACTIVE SUBSPACES
5. RIDGE APPROXIMATIONS

Slide 24

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

Consider the d-dimensional input observations to some computational model, i.e., \( \mathbf{x} \in \mathbb{R}^d \).

Slide 25

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

[Figure: a human, a 3D object, casting a shadow, a 2D projection of that object.]

Be aware: understanding high-dimensional spaces is very important!

Slide 26

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

[Figure: a human, a 3D object, casting a shadow, a 2D projection of that object.]

Consider the projection of a cube onto a plane: it can be either a square or a hexagon. Be aware: understanding high-dimensional spaces is very important!

Slide 27

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

Generate random samples within a d = 3 dimensional cube and project them using a random orthogonal projection. [Figure: the "zonotope", the projection of the hypercube onto a plane.]

Slide 28

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

Generate random samples within a d = 10 dimensional cube and project them using a random orthogonal projection. [Figure: the "zonotope", the projection of the hypercube onto a plane.]

Slide 29

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

Generate random samples within a d = 50 dimensional cube and project them using a random orthogonal projection. [Figure: the "zonotope", the projection of the hypercube onto a plane.]

Slide 30

PROJECTIONS & ZONOTOPES: SUBSPACE-BASED [LINEAR] PROJECTIONS

Generate random samples within a d = 300 dimensional cube and project them using a random orthogonal projection. [Figure: the "zonotope", the projection of the hypercube onto a plane.]
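A sketch of this demonstration, assuming plain numpy; the sample count is arbitrary, and the random orthogonal projection is drawn from the QR factorisation of a random Gaussian matrix:

import numpy as np
import matplotlib.pyplot as plt

d = 50  # dimension of the hypercube; try d = 3, 10, 50, 300

# Random samples in the d-dimensional cube [-1, 1]^d.
X = 2.0 * np.random.rand(5000, d) - 1.0

# Random orthogonal projection onto a plane: the Q factor of a random
# Gaussian matrix has orthonormal columns.
Q, _ = np.linalg.qr(np.random.randn(d, 2))

# The projected samples fill a zonotope, the 2D shadow of the hypercube.
P = X @ Q
plt.scatter(P[:, 0], P[:, 1], s=2)
plt.gca().set_aspect('equal')
plt.show()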

Slide 31

OVERVIEW

1. PRINCIPAL COMPONENTS
2. A SIMPLE EXPERIMENT
3. PROJECTIONS & ZONOTOPES
4. ACTIVE SUBSPACES
5. RIDGE APPROXIMATIONS

Slide 32

ACTIVE SUBSPACES: A VISUALIZATION

[Figure by Serouj Ourishian, own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=54972346]


Slide 34

ACTIVE SUBSPACES

Given a function \( f(\mathbf{x}) \), its gradient vector \( \nabla f(\mathbf{x}) \), and a weight function \( \rho(\mathbf{x}) \)...

Slide 35

ACTIVE SUBSPACES

Given a function \( f(\mathbf{x}) \), its gradient vector \( \nabla f(\mathbf{x}) \), and a weight function \( \rho(\mathbf{x}) \), the average outer product of the gradient and its eigendecomposition is

\[ \mathbf{C} = \int \nabla f(\mathbf{x}) \, \nabla f(\mathbf{x})^\top \rho(\mathbf{x}) \, d\mathbf{x} = \mathbf{W} \Lambda \mathbf{W}^\top, \]

where \( \Lambda \) is a diagonal matrix of eigenvalues.
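A minimal Monte Carlo sketch of this construction, assuming an analytic gradient is available; the test function, uniform weight, and sample count below are illustrative assumptions:

import numpy as np

# Hypothetical test function f(x) = sin(w^T x) on [-1, 1]^d, whose
# gradient, cos(w^T x) * w, is available analytically.
d = 5
w = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
f_grad = lambda x: np.cos(w @ x) * w

# Monte Carlo estimate of C = E[ grad f grad f^T ] under a uniform weight.
N = 1000
X = 2.0 * np.random.rand(N, d) - 1.0
C = sum(np.outer(f_grad(x), f_grad(x)) for x in X) / N

# Eigendecomposition C = W Lambda W^T; flip so eigenvalues are descending.
eigvals, W = np.linalg.eigh(C)
eigvals, W = eigvals[::-1], W[:, ::-1]
W1 = W[:, :1]  # the dominant eigenvector spans the (1D) active subspace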

Slide 36

ACTIVE SUBSPACES

Partition the eigenvectors and eigenvalues:

\[ \mathbf{W} = [\, \mathbf{W}_1 \;\; \mathbf{W}_2 \,], \qquad \Lambda = \begin{bmatrix} \Lambda_1 & \\ & \Lambda_2 \end{bmatrix}. \]

[Figure: eigenvalues plotted on a log scale.]

Slide 37

ACTIVE SUBSPACES

Partition the eigenvectors and eigenvalues as above. The eigenvalues measure ridge structure along the eigenvectors: each eigenvalue is the averaged, squared directional derivative of \( f \) along its eigenvector,

\[ \lambda_i = \int \left( \nabla f(\mathbf{x})^\top \mathbf{w}_i \right)^2 \rho(\mathbf{x}) \, d\mathbf{x}. \]

[Figure: eigenvalues plotted on a log scale.]

Slide 38

ACTIVE SUBSPACES

Non-dominant eigenvalues measure the approximation error. The span of the dominant eigenvectors \( \mathbf{W}_1 \) is the active subspace.

Slide 39

ACTIVE SUBSPACES

Non-dominant eigenvalues measure the approximation error: with \( g \) the conditional expectation of \( f \) given the active coordinates,

\[ \int \left( f(\mathbf{x}) - g(\mathbf{W}_1^\top \mathbf{x}) \right)^2 \rho(\mathbf{x}) \, d\mathbf{x} \; \le \; C_P \left( \lambda_{k+1} + \dots + \lambda_d \right), \]

where \( k \) is the dimension of the active subspace and \( C_P \) is a Poincaré constant. In practice, computing the active subspace requires gradient evaluations (or approximations thereof), which is not always feasible.

Slide 40

ACTIVE SUBSPACES: ASIDE

Note that active subspaces are NOT principal component analysis (PCA)! [Figure: active subspaces vs. PCA.]

Slide 41

ACTIVE SUBSPACES

This identification of the subspace is important! Consider the following split:

\[ \mathbf{x} = \mathbf{W}_1 \mathbf{W}_1^\top \mathbf{x} + \mathbf{W}_2 \mathbf{W}_2^\top \mathbf{x}. \]


Slide 43

ACTIVE SUBSPACES

This identification of the subspace is important! Consider the following split:

\[ \mathbf{x} = \underbrace{\mathbf{W}_1 \mathbf{W}_1^\top \mathbf{x}}_{\text{active subspace}} + \underbrace{\mathbf{W}_2 \mathbf{W}_2^\top \mathbf{x}}_{\text{inactive subspace}}. \]

Slide 44

ACTIVE SUBSPACES: CODE

The 'active-subspace' method constructs a global quadratic polynomial model to estimate gradients.

from equadratures import *

space = Subspaces(method='active-subspace', sample_points=X, sample_outputs=y)

Slide 45

OVERVIEW

1. PRINCIPAL COMPONENTS
2. A SIMPLE EXPERIMENT
3. PROJECTIONS & ZONOTOPES
4. ACTIVE SUBSPACES
5. RIDGE APPROXIMATIONS

Slide 46

RIDGE APPROXIMATION

Consider the task of approximating \( f(\mathbf{x}) \) with a polynomial expansion, i.e., solving

\[ \underset{\mathbf{c}}{\text{minimise}} \; \left\| \mathbf{A} \mathbf{c} - \mathbf{y} \right\|_2, \]

where \( \mathbf{A} \) holds the basis polynomials evaluated at the sample points, \( \mathbf{y} \) the corresponding model outputs, and \( \mathbf{c} \) the unknown coefficients. This can be done via least squares. One can also consider solving this in other norms, leading to solutions via compressed sensing, or some combination thereof.
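A sketch of the least-squares route, with a hypothetical one-dimensional example and a Legendre-Vandermonde matrix standing in for A:

import numpy as np

# Hypothetical 1D example: y holds noisy model outputs at sample points x.
x = np.linspace(-1, 1, 50)
y = np.cos(np.pi * x) + 0.01 * np.random.randn(50)

# A holds Legendre basis polynomials evaluated at the sample points.
A = np.polynomial.legendre.legvander(x, deg=7)

# Solve minimise_c || A c - y ||_2 via least squares.
c, *_ = np.linalg.lstsq(A, y, rcond=None)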

Slide 47

RIDGE APPROXIMATION

Now, we consider the approximation problem

\[ \underset{\mathbf{M}, \, \mathbf{c}}{\text{minimise}} \; \left\| f(\mathbf{x}) - g_{\mathbf{c}}(\mathbf{M}^\top \mathbf{x}) \right\|, \]

over both the subspace \( \mathbf{M} \) and the polynomial coefficients \( \mathbf{c} \).

Hokanson & Constantine (2018). Data-driven polynomial ridge approximation using variable projection. SIAM Journal on Scientific Computing.
Golub & Pereyra (2003). Separable nonlinear least squares: the variable projection method and its applications. Inverse Problems.

Slide 48

RIDGE APPROXIMATION

Now, we consider the approximation problem above. The key insight is to re-write it as

\[ \underset{\mathbf{M}}{\text{minimise}} \;\; \underset{\mathbf{c}}{\text{minimise}} \; \left\| \mathbf{A}(\mathbf{M}) \, \mathbf{c} - \mathbf{y} \right\|_2, \]

where \( \mathbf{A}(\mathbf{M}) \) holds the basis polynomials evaluated at the projected points \( \mathbf{M}^\top \mathbf{x}_i \).

Slide 49

RIDGE APPROXIMATION

The inner least-squares problem has a closed-form solution, so the re-written problem is equivalent to

\[ \underset{\mathbf{M}}{\text{minimise}} \; \left\| \left( \mathbf{I} - \mathbf{A}(\mathbf{M}) \, \mathbf{A}(\mathbf{M})^{+} \right) \mathbf{y} \right\|_2, \]

where \( {}^{+} \) denotes the matrix pseudoinverse. This is an optimisation problem over subspaces only!
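A sketch of this objective under the definitions above; the Legendre basis, degree, and test data are illustrative assumptions, not the library's internals:

import numpy as np

def vp_residual(M, X, y, deg=3):
    # Residual || (I - A(M) A(M)^+) y ||_2, where A(M) holds a Legendre
    # basis evaluated at the projected coordinates M^T x.
    t = (X @ M)[:, 0]
    A = np.polynomial.legendre.legvander(t, deg)
    r = y - A @ (np.linalg.pinv(A) @ y)
    return np.linalg.norm(r)

# Hypothetical data with an exact one-dimensional quadratic ridge.
X = 2.0 * np.random.rand(300, 5) - 1.0
u = np.ones((5, 1)) / np.sqrt(5.0)
y = ((X @ u)[:, 0]) ** 2

print(vp_residual(u, X, y))                 # near zero: u spans the true subspace
print(vp_residual(np.eye(5)[:, :1], X, y))  # larger: a wrong direction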

Slide 50

RIDGE APPROXIMATION

active subspaces
• Require gradients or gradient estimates (e.g., adjoints).
• May require many model evaluations.
• The "reduced" dimension is easy to gauge based on eigenvalue decay.

ridge approximations
• Do not require gradients or gradient estimates (e.g., adjoints).
• Require a number of model evaluations that depends on the "reduced" dimension.
• The "reduced" dimension may need to be estimated.

Slide 51

RIDGE APPROXIMATIONS: CODE

from equadratures import *

space = Subspaces(method='variable-projection', sample_points=X, sample_outputs=y)
M = space.get_subspace()
subspace_poly = space.get_subspace_polynomial()
sobol_indices = subspace_poly.get_sobol_indices()
moments = subspace_poly.get_mean_and_variance()
space.plot_2D_contour_zonotope()

Slide 52

LEARN CONTINUOUS STRUCTURE

THANK YOU

Made by the equadratures team, with ❤