Py-Earth: Multivariate Adaptive Regression Splines (MARS) in Python
Mehdi Cherti (Appstat, LAL/CNRS)
Supervised by Balázs Kégl (LAL/CNRS) and Alexandre Gramfort (CNRS LTCI)
October 26, 2015
Introduction
MARS is a regression technique for high-dimensional data:
first introduced by Jerome H. Friedman in 1991
it is non-linear and non-parametric
Py-earth is an implementation of MARS in Python, created by Jason Rudy
The goal is to bring Py-earth into scikit-learn
Setup
Multivariate regression with multiple outputs: the data are pairs (X_i, Y_i), where
i indexes the i-th example
X_i is a vector describing example i
Y_i is a real-valued vector of the true outputs of example i
We want to find a model f that predicts Y from X with low generalization mean squared error (MSE)
How does MARS work ?
Basic building block : Hinge functions,
y = max(x − k, 0)
or
y = max(k − x, 0)
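The two hinge functions above can be sketched in a few lines of plain Python. The function names and the knot value 3.0 are illustrative choices, not taken from py-earth.

```python
def hinge_right(x, k):
    """max(x - k, 0): zero up to the knot k, then rises with slope 1."""
    return max(x - k, 0.0)


def hinge_left(x, k):
    """max(k - x, 0): the mirrored hinge, nonzero only below the knot k."""
    return max(k - x, 0.0)


knot = 3.0  # illustrative knot location
for x in (1.0, 3.0, 4.5):
    print(x, hinge_right(x, knot), hinge_left(x, knot))
```

At any point at most one of the two hinges is nonzero, and their difference is x − k, which is why a mirrored pair can represent a change of slope at the knot.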
How does MARS work ?
The algorithm adaptively constructs a set of basis functions $B_k(X)$
Each basis function is a product of hinge functions, for instance:
$B_k(X) = \max(X_1 - 5, 0)\,\max(X_2 + 4, 0)$
The model is a linear combination of those basis functions:
$Y = \sum_{k=1}^{K} \alpha_k B_k(X)$
The specificity of MARS comes from how the basis functions are created and added to the model
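As a minimal sketch of this model form, the snippet below evaluates the example basis function and a linear combination of basis functions in plain Python. The constant first term and the coefficient values are assumptions made for illustration; they do not come from a fitted model.

```python
def hinge(value):
    """max(value, 0), the building block of every basis function."""
    return max(value, 0.0)


def example_basis(x1, x2):
    """B_k(X) = max(X1 - 5, 0) * max(X2 + 4, 0), the product from the slide."""
    return hinge(x1 - 5.0) * hinge(x2 + 4.0)


def mars_predict(x, basis_functions, alphas):
    """Y = sum_k alpha_k * B_k(X) for a single input vector x."""
    return sum(a * b(x) for a, b in zip(alphas, basis_functions))


basis_functions = [
    lambda x: 1.0,                        # constant term (assumed here)
    lambda x: example_basis(x[0], x[1]),  # the product of two hinges
]
alphas = [2.0, 0.5]  # made-up coefficients, for illustration only

prediction = mars_predict((7.0, -1.0), basis_functions, alphas)
print(prediction)
```

Once the basis functions are fixed, the model is linear in the coefficients, so they can be estimated by ordinary least squares.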
How does MARS work ?
MARS works in two steps: the forward pass and the pruning pass
In the forward pass, we over-generate a set of basis functions
In the pruning pass, we remove the unnecessary basis functions
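To give a feel for what one step of the forward pass searches for, here is a deliberately reduced sketch: a single input variable, a single hinge max(x − k, 0) plus an intercept, and an exhaustive search over candidate knots scored by the sum of squared errors. A real forward pass also considers mirrored hinge pairs, every input variable, and products with existing basis functions, so this is an illustration under simplifying assumptions and not the py-earth implementation. All names are hypothetical.

```python
def fit_line(h, y):
    """Closed-form least squares for y ~ a + b * h; returns (a, b, sse)."""
    n = len(h)
    mean_h = sum(h) / n
    mean_y = sum(y) / n
    var_h = sum((v - mean_h) ** 2 for v in h)
    if var_h == 0.0:
        b = 0.0  # constant feature: only the intercept can be fitted
    else:
        b = sum((v - mean_h) * (t - mean_y) for v, t in zip(h, y)) / var_h
    a = mean_y - b * mean_h
    sse = sum((t - (a + b * v)) ** 2 for v, t in zip(h, y))
    return a, b, sse


def best_knot(x, y):
    """Try every observed x as a knot for max(x - k, 0); keep the lowest SSE."""
    best = None
    for k in sorted(set(x)):
        h = [max(v - k, 0.0) for v in x]
        a, b, sse = fit_line(h, y)
        if best is None or sse < best[3]:
            best = (k, a, b, sse)
    return best


# Toy data: flat at 1 up to x = 4, then rising with slope 2.
x = [float(i) for i in range(9)]
y = [1.0 + 2.0 * max(v - 4.0, 0.0) for v in x]

knot, intercept, slope, sse = best_knot(x, y)
print(knot, intercept, slope, sse)
```

On this noiseless toy data the search recovers the knot at 4, since that is the only candidate for which the target is exactly linear in the hinge feature.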
How does MARS work ? The forward pass
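$$
\begin{aligned}
y ={} & \alpha_0 + \alpha_1 \max(x_1 - 3, 0) + \alpha_2 \max(3 - x_1, 0) \\
& + \alpha_3 \max(x_1 - 3, 0)\,\max(x_2 - 7, 0) + \alpha_4 \max(x_1 - 3, 0)\,\max(7 - x_2, 0) \\
& + \alpha_5 \max(x_2 - 12, 0) + \alpha_6 \max(12 - x_2, 0)
\end{aligned}
\tag{1}
$$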
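Equation (1) can be evaluated directly once the coefficients are known. The sketch below lists its seven terms in order and combines them with a coefficient vector; the coefficient values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def hinge(value):
    """max(value, 0)."""
    return max(value, 0.0)


def equation_1_terms(x1, x2):
    """The seven basis functions of equation (1), in the order alpha_0..alpha_6."""
    return [
        1.0,                               # constant term
        hinge(x1 - 3.0),                   # max(x1 - 3, 0)
        hinge(3.0 - x1),                   # max(3 - x1, 0)
        hinge(x1 - 3.0) * hinge(x2 - 7.0), # interaction, upper side of x2 knot
        hinge(x1 - 3.0) * hinge(7.0 - x2), # interaction, lower side of x2 knot
        hinge(x2 - 12.0),                  # max(x2 - 12, 0)
        hinge(12.0 - x2),                  # max(12 - x2, 0)
    ]


def equation_1(x1, x2, alphas):
    """Linear combination of the terms with coefficients alphas."""
    return sum(a * t for a, t in zip(alphas, equation_1_terms(x1, x2)))


alphas = [1.0, 0.5, -0.5, 2.0, 0.0, 1.0, -1.0]  # hypothetical coefficients
y = equation_1(5.0, 10.0, alphas)
print(equation_1_terms(5.0, 10.0), y)
```

Both interaction terms reuse the parent hinge max(x1 − 3, 0), which shows how the forward pass builds higher-degree terms by multiplying an existing basis function with a new mirrored pair.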
What has been done?
GitHub repo: https://github.com/jcrudy/py-earth
git clone https://github.com/jcrudy/py-earth
cd py-earth
python setup.py install
The state of the code:
Py-earth already supported many features, and the important parts were in place
However, it was not ready to be merged into scikit-learn
It did not support multiple outputs
What has been done? Improving code quality
Clean up the code (PEP 8) and adapt it to the scikit-learn coding guidelines
Enhance the documentation
Add more unit tests
What has been done? New features
Support for multiple outputs
Support for output weights
Support for estimating variable importance
Implementation of FastMARS (Jerome H. Friedman, 1993)
Example : 1D example
Example : 2D example
Example : multiple outputs
20 inputs, 3 outputs, only 2 informative inputs, the rest is noise
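A dataset with this shape can be generated as follows. This is only a sketch: the three target functions are hypothetical hinge-based choices made for illustration and are not the ones used in the talk; the only properties taken from the slide are 20 inputs, 3 outputs, and 2 informative inputs.

```python
import random


def make_multi_output_data(n_samples=200, n_inputs=20, seed=0):
    """Return (X, Y) with 3 outputs that depend only on the first 2 inputs."""
    rng = random.Random(seed)
    X, Y = [], []
    for _ in range(n_samples):
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        x0, x1 = row[0], row[1]  # the only informative inputs
        Y.append([
            max(x0 - 0.2, 0.0),           # hypothetical target 1
            max(0.2 - x0, 0.0) + x1,      # hypothetical target 2
            max(x0, 0.0) * max(x1, 0.0),  # hypothetical target 3 (interaction)
        ])
        X.append(row)  # columns 2..19 are pure noise inputs
    return X, Y


X, Y = make_multi_output_data()
print(len(X), len(X[0]), len(Y[0]))
```

A multiple-output model fitted on such data shares one set of basis functions across the three outputs and estimates a separate coefficient vector for each.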
[Figure: a crop of the graph of basis functions]
Example : variable importance
$y = \sin(\pi x_0 x_1) + 20 (x_2 - 0.5)^2 + 10 x_3 + 5 x_4 + 5\,\mathcal{N}(0, 1)$
The code in the previous slide gives:
Example : FastMARS
Top : Normal, Bottom : FastMARS
Future
Close the current issues and keep working on code quality so that it can be merged into scikit-learn
Some features are still missing and are planned:
Handling missing values
Support for categorical variables
Thank you
Thank you for listening