Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PyData Meetup Group Presentation
Search
Jason Rudy
May 29, 2013
Programming
2
780
PyData Meetup Group Presentation
Presentation on py-earth to the San Francisco PyData Meetup group on 2013-05-29.
Jason Rudy
May 29, 2013
Tweet
Share
Other Decks in Programming
See All in Programming
採用事例の少ないSvelteを選んだ理由と それを正解にするためにやっていること
oekazuma
2
1k
LLM Supervised Fine-tuningの理論と実践
datanalyticslabo
4
1.1k
クリエイティブコーディングとRuby学習 / Creative Coding and Learning Ruby
chobishiba
0
3.9k
Webエンジニア主体のモバイルチームの 生産性を高く保つためにやったこと
igreenwood
0
330
Jakarta EE meets AI
ivargrimstad
0
240
MCP with Cloudflare Workers
yusukebe
2
220
PHPで学ぶプログラミングの教訓 / Lessons in Programming Learned through PHP
nrslib
2
170
Cloudflare MCP ServerでClaude Desktop からWeb APIを構築
kutakutat
1
540
開発者とQAの越境で自動テストが増える開発プロセスを実現する
92thunder
1
180
tidymodelsによるtidyな生存時間解析 / Japan.R2024
dropout009
1
770
暇に任せてProxmoxコンソール 作ってみました
karugamo
1
720
The Efficiency Paradox and How to Save Yourself and the World
hollycummins
1
440
Featured
See All Featured
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
159
15k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
44
9.3k
Adopting Sorbet at Scale
ufuk
73
9.1k
Building Flexible Design Systems
yeseniaperezcruz
327
38k
The Cost Of JavaScript in 2023
addyosmani
45
7k
Producing Creativity
orderedlist
PRO
341
39k
Into the Great Unknown - MozCon
thekraken
33
1.5k
The Pragmatic Product Professional
lauravandoore
32
6.3k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
6
510
Code Review Best Practice
trishagee
65
17k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
Testing 201, or: Great Expectations
jmmastey
40
7.1k
Transcript
MARS in Python or A Tale of Two Planets 1
Outline • Motivating use case • MARS algorithm • Py-earth
• Examples 2
3
4
M A R S ultivariate daptive egression plines 5
Not MARS •MARSplines •MARegressionSplines •ARES •earth 6
7
HbA1c Age Gender Etc. Cost X X X X X
X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X 8
Constraints •Non-monotone relationships among variables •Interactions among predictors •Simple model
9
10
11
Illustration by Yi-Ke Peng 12
13
Python R My Brain Raw data processing Object-relational mapping Feature
extraction Plotting Bootstrapping Normalization Multivariate Adaptive Regression Splines 14
15
16
Regression: The search for f(x) yj = f ( x1j,
. . . , xnj) + ✏j 17
Linear Regression ˆ f ( x ) = a0 +
P X i=1 aixi 18
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 19
Hinge Functions CDify h ( x t ) = [
x t ]+ = ( x t, x > t 0 , x t 20
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 21
y = 1 2h (1 x ) + 1 2
h ( x 1) Multivariate Adaptive Regression Splines 22
Multivariate Adaptive Regression Splines y = h ( x 1)
h ( x 1) + h (1 x ) h (1 x ) 23
Multivariate Adaptive Regression Splines y = 2 + 0 .
1h ( x 1) + h (1 x ) + 3h ( x 1) h (4 x ) 24
Multivariate Example z = h ( 3 x ) +
h ( 3 x ) h (5 y ) 25
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 26
Forward Pass Pruning Pass 27
Forward Pass • while True: • best_err = Infinity •
for each term, predictor, knot candidate: • err = get_squared_error(term, predictor, knot) • if err < best_err: • best_err = err • best_term, best_pred, best_knot = term, predictor, knot • add term pair for best_term, best_pred, best_knot • check stopping conditions 28
Forward Pass 1 Start Iteration 1 Iteration 2 h( x
t ) h( t x ) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 29
Forward Pass • while True: • best_err = Infinity •
for each term, predictor, knot candidate: • err = get_squared_error(term, predictor, knot) • if err < best_err: • best_err = err • best_term, best_pred, best_knot = term, predictor, knot • add term pair for best_term, best_pred, best_knot • check stopping conditions 30
O N2P3 31
Forward Pass 1 Start Iteration 1 Iteration 2 h( x
t ) h( t x ) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 32
Generalized Cross Validation GCV = 1 N PN i=1 [yi
ˆ yi]2 1 N2 (N Q d (Q 1))2 33
Pruning Pass • for i in range(num_terms): • best_score =
Infinity • for term in terms: • score = GCV(model \ term) • if score < best_score: • best_score = score • term_to_drop = term • remove term_to_drop from model • models[i] = model.copy() • scores[i] = score • selected_model = models[argmin(scores)] 34
Pruning Pass 1 h( x t ) h( t x
) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 35
Final Model [yi ˆ yi]2 d(Q 1))2 y = a0
+ a1 h ( t x ) + a2 h ( x t ) h ( x s ) 36
37
Implementation Goals •Compatible with numpy ecosystem •Fast and reliable •Easy
to maintain 38
39
40
>git clone git://github.com/jcrudy/py-earth.git >cd py-earth >sudo python setup.py install Installation
41
Important Earth Methods •fit(X,y) •transform(X) •predict(X) 42
Simple Example 43
Simple Example 44
45
With Pandas 46
With Patsy 47
Classification 48
Classification 49
50
Future Plans •Documentation •Integrate into scikit-learn •Multiple responses •Sample weights
51
Summary • MARS is a simple but flexible regression method
• py-earth is MARS for Python data stack • Try it! 52
py-earth A far better thing than I have ever done
• https://github.com/jcrudy/py-earth 53