PyData Meetup Group Presentation
Jason Rudy
May 29, 2013
Presentation on py-earth to the San Francisco PyData Meetup group on 2013-05-29.
Transcript
MARS in Python or A Tale of Two Planets 1
Outline • Motivating use case • MARS algorithm • Py-earth • Examples 2
3
4
MARS: Multivariate Adaptive Regression Splines 5
Not MARS •MARSplines •MARegressionSplines •ARES •earth 6
7
[Table: data matrix with columns HbA1c, Age, Gender, Etc., and Cost; each cell is marked X] 8
Constraints •Non-monotone relationships among variables •Interactions among predictors •Simple model
9
10
11
Illustration by Yi-Ke Peng 12
13
[Diagram labeled Python, R, and My Brain, covering: Raw data processing, Object-relational mapping, Feature extraction, Plotting, Bootstrapping, Normalization, Multivariate Adaptive Regression Splines] 14
15
16
Regression: The search for f(x) $y_j = f(x_{1j}, \ldots, x_{nj}) + \epsilon_j$ 17
Linear Regression $\hat{f}(x) = a_0 + \sum_{i=1}^{P} a_i x_i$ 18
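As a point of comparison for the MARS slides that follow, the linear model above can be fit by ordinary least squares. This is a minimal sketch using `numpy.linalg.lstsq`; the helper names `fit_linear` and `predict_linear` are hypothetical and not part of py-earth.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of f(x) = a_0 + sum_i a_i * x_i.

    Returns an array whose first entry is a_0 and whose remaining
    entries are a_1..a_P.
    """
    # Prepend a column of ones so the intercept a_0 is estimated too.
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Evaluate the fitted linear model on the rows of X."""
    return coef[0] + X @ coef[1:]


if __name__ == "__main__":
    X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 3.0]])
    y = 1.0 + 2.0 * X[:, 0] - X[:, 1]
    print(fit_linear(X, y))
```

A model of this form cannot represent bends or interactions among predictors, which is the gap the hinge-based terms of MARS are meant to fill.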
Multivariate Adaptive Regression Splines $\hat{f}(x) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_+$ 19
Hinge Functions $h(x - t) = [x - t]_+ = \begin{cases} x - t, & x > t \\ 0, & x \le t \end{cases}$ 20
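The hinge function is just the positive part of its argument, so it takes one line of code. The sketch below uses plain Python; the name `hinge_pair` is a hypothetical helper that returns the two mirrored hinges sharing a knot `t`.

```python
def hinge(z):
    """Positive part [z]_+ : returns z when z > 0, otherwise 0."""
    return z if z > 0 else 0.0


def hinge_pair(x, t):
    """Return (h(x - t), h(t - x)), the mirrored hinges at knot t."""
    return hinge(x - t), hinge(t - x)


if __name__ == "__main__":
    # To the right of the knot only the first hinge is active,
    # to the left only the second one is.
    print(hinge_pair(3.0, 1.0))
    print(hinge_pair(0.0, 1.0))
```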
Multivariate Adaptive Regression Splines $\hat{f}(x) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_+$ 21
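Read as code, the model is an intercept plus a weighted sum of terms, where each term is a product of hinge functions. The sketch below evaluates such a model for one observation. The term layout (a coefficient plus a list of `(sign, variable_index, knot)` factors) is an assumption made for illustration and is not py-earth's internal representation.

```python
from math import prod


def hinge(z):
    """Positive part [z]_+."""
    return z if z > 0 else 0.0


def mars_predict(x, intercept, terms):
    """Evaluate a_0 + sum_m a_m * prod_k [s_km * (x_v(k,m) - t_km)]_+.

    x         -- sequence of predictor values for one observation
    intercept -- a_0
    terms     -- list of (a_m, factors); each factor is a tuple
                 (sign, variable_index, knot) with sign +1 or -1
    """
    total = intercept
    for coef, factors in terms:
        total += coef * prod(hinge(s * (x[v] - t)) for s, v, t in factors)
    return total


if __name__ == "__main__":
    # Hypothetical model: y = 1 + 2*h(x0 - 1) + 3*h(x0 - 1)*h(4 - x1)
    terms = [
        (2.0, [(+1, 0, 1.0)]),
        (3.0, [(+1, 0, 1.0), (-1, 1, 4.0)]),
    ]
    print(mars_predict((2.0, 3.0), 1.0, terms))
```

A term with more than one factor is an interaction: it is nonzero only where all of its hinges are active at once.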
Multivariate Adaptive Regression Splines $y = 1 - 2h(1 - x) + \frac{1}{2} h(x - 1)$ 22
Multivariate Adaptive Regression Splines $y = h(x - 1)\,h(x - 1) + h(1 - x)\,h(1 - x)$ 23
Multivariate Adaptive Regression Splines $y = 2 + 0.1\,h(x - 1) + h(1 - x) + 3\,h(x - 1)\,h(4 - x)$ 24
Multivariate Example $z = h(-3 - x) + h(-3 - x)\,h(5 - y)$ 25
Multivariate Adaptive Regression Splines $\hat{f}(x) = a_0 + \sum_{m=1}^{M} a_m \prod_{k=1}^{K_m} \left[ s_{km} \left( x_{v(k,m)} - t_{km} \right) \right]_+$ 26
Forward Pass Pruning Pass 27
Forward Pass
while True:
    best_err = Infinity
    for each term, predictor, knot candidate:
        err = get_squared_error(term, predictor, knot)
        if err < best_err:
            best_err = err
            best_term, best_pred, best_knot = term, predictor, knot
    add term pair for best_term, best_pred, best_knot
    check stopping conditions
28
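The forward pass pseudocode can be made runnable with a brute-force search. The sketch below refits the whole model by least squares for every candidate, which is far slower than a real implementation would be, and its stopping conditions (`max_terms`, `min_improvement`) and the choice of observed data values as knot candidates are assumptions for illustration. The function name `forward_pass` and its return values are hypothetical.

```python
import numpy as np


def forward_pass(X, y, max_terms=5, min_improvement=1e-8):
    """Greedy forward pass: repeatedly add the best mirrored hinge pair.

    Returns the basis matrix, a description of each term as a tuple of
    (sign, variable_index, knot) factors, and the final squared error.
    """
    n, p = X.shape
    basis = [np.ones(n)]   # start from the constant term
    description = [()]     # the constant term has no factors
    current_err = np.sum((y - y.mean()) ** 2)

    while len(basis) + 2 <= max_terms:
        best_err, best = np.inf, None
        for parent in range(len(basis)):          # existing term to extend
            for var in range(p):                  # predictor
                for knot in np.unique(X[:, var]):  # knot candidate
                    plus = basis[parent] * np.maximum(X[:, var] - knot, 0.0)
                    minus = basis[parent] * np.maximum(knot - X[:, var], 0.0)
                    B = np.column_stack(basis + [plus, minus])
                    coef = np.linalg.lstsq(B, y, rcond=None)[0]
                    err = np.sum((y - B @ coef) ** 2)
                    if err < best_err:
                        best_err = err
                        best = (parent, var, knot, plus, minus)

        # Stopping condition: no candidate lowers the error meaningfully.
        if best is None or current_err - best_err < min_improvement:
            break
        parent, var, knot, plus, minus = best
        basis += [plus, minus]
        description += [
            description[parent] + ((+1, var, knot),),
            description[parent] + ((-1, var, knot),),
        ]
        current_err = best_err

    return np.column_stack(basis), description, current_err


if __name__ == "__main__":
    X = np.arange(-5, 6, dtype=float).reshape(-1, 1)
    B, description, err = forward_pass(X, np.abs(X[:, 0]))
    print(description, err)
```

Because each new pair multiplies an existing term, extending a non-constant parent is what produces interaction terms.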
Forward Pass [Diagram: basis functions by iteration. Start: 1. Iteration 1: h(x - t), h(t - x). Iteration 2: h(x - t) × h(x - s), h(x - t) × h(s - x)] 29
Forward Pass
while True:
    best_err = Infinity
    for each term, predictor, knot candidate:
        err = get_squared_error(term, predictor, knot)
        if err < best_err:
            best_err = err
            best_term, best_pred, best_knot = term, predictor, knot
    add term pair for best_term, best_pred, best_knot
    check stopping conditions
30
$O(N^2 P^3)$ 31
Forward Pass [Diagram: basis functions by iteration. Start: 1. Iteration 1: h(x - t), h(t - x). Iteration 2: h(x - t) × h(x - s), h(x - t) × h(s - x)] 32
Generalized Cross Validation $\mathrm{GCV} = \frac{\frac{1}{N} \sum_{i=1}^{N} [y_i - \hat{y}_i]^2}{\frac{1}{N^2} \left( N - Q - d(Q - 1) \right)^2}$ 33
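The GCV score is the mean squared error inflated by a penalty that grows with the number of terms Q. The sketch below computes it directly from the formula; reading Q as the number of terms including the intercept, and the default d = 3, are assumptions made here rather than something the slide states.

```python
def gcv(y, y_hat, num_terms, penalty=3.0):
    """Generalized cross validation score.

    y, y_hat  -- observed and fitted values (equal-length sequences)
    num_terms -- Q, assumed here to count the intercept
    penalty   -- d, the cost charged per knot (assumed default)
    """
    n = len(y)
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / n
    # Denominator (1/N^2) * (N - Q - d*(Q - 1))^2 from the formula.
    denominator = (n - num_terms - penalty * (num_terms - 1)) ** 2 / n ** 2
    return mse / denominator


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]
    y_hat = [1.0, 2.0, 3.0, 5.0]
    print(gcv(y, y_hat, num_terms=1))
```

The denominator shrinks toward zero as Q + d(Q - 1) approaches N, so the score is only meaningful when the number of observations is comfortably larger than that quantity.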
Pruning Pass
for i in range(num_terms):
    best_score = Infinity
    for term in terms:
        score = GCV(model \ term)
        if score < best_score:
            best_score = score
            term_to_drop = term
    remove term_to_drop from model
    models[i] = model.copy()
    scores[i] = best_score
selected_model = models[argmin(scores)]
34
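A runnable version of the pruning pass might look like the sketch below. It assumes the basis matrix `B` comes from a forward pass with the constant term in column 0, never drops that column, and refits by least squares for every candidate subset. The helper names and the default penalty of 3 are assumptions for illustration, not py-earth's API.

```python
import numpy as np


def gcv(y, y_hat, num_terms, penalty=3.0):
    """GCV written as MSE / (1 - (Q + d*(Q - 1)) / N)^2."""
    n = len(y)
    mse = np.mean((y - y_hat) ** 2)
    return mse / (1.0 - (num_terms + penalty * (num_terms - 1)) / n) ** 2


def subset_score(B, y, cols, penalty):
    """Refit on the chosen columns of B and return the GCV score."""
    sub = B[:, cols]
    coef = np.linalg.lstsq(sub, y, rcond=None)[0]
    return gcv(y, sub @ coef, len(cols), penalty)


def pruning_pass(B, y, penalty=3.0):
    """Backward elimination: drop one term at a time, keep the best subset."""
    cols = list(range(B.shape[1]))
    models = [list(cols)]
    scores = [subset_score(B, y, cols, penalty)]

    while len(cols) > 1:
        best_score, term_to_drop = np.inf, cols[-1]
        for term in cols[1:]:  # column 0 (the intercept) is never dropped
            candidate = [c for c in cols if c != term]
            score = subset_score(B, y, candidate, penalty)
            if score < best_score:
                best_score, term_to_drop = score, term
        cols.remove(term_to_drop)
        models.append(list(cols))
        scores.append(best_score)

    return models[int(np.argmin(scores))], scores


if __name__ == "__main__":
    x = np.arange(-20, 21, dtype=float)
    B = np.column_stack([
        np.ones_like(x),
        np.maximum(x, 0.0),
        np.maximum(-x, 0.0),
        np.maximum(x - 2.0, 0.0),  # a term the target does not need
    ])
    selected, scores = pruning_pass(B, np.abs(x))
    print(selected, scores)
```

Recording the best score at each size, rather than the last score tried, is what lets the final `argmin` compare the best model of every size.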
Pruning Pass [Diagram: candidate basis functions: 1, h(x - t), h(t - x), h(x - t) × h(x - s), h(x - t) × h(s - x)] 35
Final Model $y = a_0 + a_1 h(t - x) + a_2 h(x - t)\,h(x - s)$ 36
37
Implementation Goals •Compatible with numpy ecosystem •Fast and reliable •Easy to maintain 38
39
40
Installation
>git clone git://github.com/jcrudy/py-earth.git
>cd py-earth
>sudo python setup.py install
41
Important Earth Methods •fit(X,y) •transform(X) •predict(X) 42
Simple Example 43
Simple Example 44
45
With Pandas 46
With Patsy 47
Classification 48
Classification 49
50
Future Plans •Documentation •Integrate into scikit-learn •Multiple responses •Sample weights
51
Summary • MARS is a simple but flexible regression method • py-earth is MARS for the Python data stack • Try it! 52
py-earth A far better thing than I have ever done
• https://github.com/jcrudy/py-earth 53