Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PyData Meetup Group Presentation
Search
Jason Rudy
May 29, 2013
Programming
820
2
Share
PyData Meetup Group Presentation
Presentation on py-earth to the San Francisco PyData Meetup group on 2013-05-29.
Jason Rudy
May 29, 2013
Other Decks in Programming
See All in Programming
ユニットテストの先へ:テスト技法で要求・仕様を整理するJava開発実践 / Beyond_Unit_Testing_Practical_Java_Development_Techniques_for_Organizing_Requirements_and_Specifications
shimashima35
0
310
net-httpのHTTP/2対応について
naruse
0
390
さぁV100、メモリをお食べ・・・
nilpe
0
120
Lessons from Spec-Driven Development
simas
PRO
0
110
Migrations : C'est une question d'hygiène !
vinceamstoutz
0
2.5k
Technical Debt: Understanding it Rightly, Engaging it Rightly #LaravelLiveJP
shogogg
0
180
自動レビューエンジンの実装と運用 ~レビューのない世界へ~
kurukuru1999
2
300
プラグインで拡張される Context をtype-safe にする難しさと設計判断
kazupon
2
500
TypeScriptだけでAIエージェントを作る フロント・エージェント・インフラのフルスタック実践
har1101
6
1.2k
GitHub Copilot CLIのいいところ
htkym
2
1.2k
AI駆動開発で崩れていくコードベースを立て直す
kyoko_nr_nr
1
400
RailsTokyo 2026#4: AI様があれば、 Hotwireの弱点は消えるか?
naofumi
5
1k
Featured
See All Featured
Hiding What from Whom? A Critical Review of the History of Programming languages for Music
tomoyanonymous
2
830
[SF Ruby Conf 2025] Rails X
palkan
2
1.1k
Rails Girls Zürich Keynote
gr2m
96
14k
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2.2k
How to Ace a Technical Interview
jacobian
281
24k
Noah Learner - AI + Me: how we built a GSC Bulk Export data pipeline
techseoconnect
PRO
0
190
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
3.3k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
210
Six Lessons from altMBA
skipperchong
29
4.3k
技術選定の審美眼(2025年版) / Understanding the Spiral of Technologies 2025 edition
twada
PRO
118
120k
The AI Search Optimization Roadmap by Aleyda Solis
aleyda
1
5.8k
JAMstack: Web Apps at Ludicrous Speed - All Things Open 2022
reverentgeek
1
460
Transcript
MARS in Python or A Tale of Two Planets 1
Outline • Motivating use case • MARS algorithm • Py-earth
• Examples 2
3
4
M A R S ultivariate daptive egression plines 5
Not MARS •MARSplines •MARegressionSplines •ARES •earth 6
7
HbA1c Age Gender Etc. Cost X X X X X
X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X 8
Constraints •Non-monotone relationships among variables •Interactions among predictors •Simple model
9
10
11
Illustration by Yi-Ke Peng 12
13
Python R My Brain Raw data processing Object-relational mapping Feature
extraction Plotting Bootstrapping Normalization Multivariate Adaptive Regression Splines 14
15
16
Regression: The search for f(x) yj = f ( x1j,
. . . , xnj) + ✏j 17
Linear Regression ˆ f ( x ) = a0 +
P X i=1 aixi 18
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 19
Hinge Functions CDify h ( x t ) = [
x t ]+ = ( x t, x > t 0 , x t 20
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 21
y = 1 2h (1 x ) + 1 2
h ( x 1) Multivariate Adaptive Regression Splines 22
Multivariate Adaptive Regression Splines y = h ( x 1)
h ( x 1) + h (1 x ) h (1 x ) 23
Multivariate Adaptive Regression Splines y = 2 + 0 .
1h ( x 1) + h (1 x ) + 3h ( x 1) h (4 x ) 24
Multivariate Example z = h ( 3 x ) +
h ( 3 x ) h (5 y ) 25
Multivariate Adaptive Regression Splines ˆ f ( x ) =
a0 + M X m=1 am Km Y k=1 ⇥ skm xv(k,m) tkm ⇤ + 26
Forward Pass Pruning Pass 27
Forward Pass • while True: • best_err = Infinity •
for each term, predictor, knot candidate: • err = get_squared_error(term, predictor, knot) • if err < best_err: • best_err = err • best_term, best_pred, best_knot = term, predictor, knot • add term pair for best_term, best_pred, best_knot • check stopping conditions 28
Forward Pass 1 Start Iteration 1 Iteration 2 h( x
t ) h( t x ) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 29
Forward Pass • while True: • best_err = Infinity •
for each term, predictor, knot candidate: • err = get_squared_error(term, predictor, knot) • if err < best_err: • best_err = err • best_term, best_pred, best_knot = term, predictor, knot • add term pair for best_term, best_pred, best_knot • check stopping conditions 30
O N2P3 31
Forward Pass 1 Start Iteration 1 Iteration 2 h( x
t ) h( t x ) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 32
Generalized Cross Validation GCV = 1 N PN i=1 [yi
ˆ yi]2 1 N2 (N Q d (Q 1))2 33
Pruning Pass • for i in range(num_terms): • best_score =
Infinity • for term in terms: • score = GCV(model \ term) • if score < best_score: • best_score = score • term_to_drop = term • remove term_to_drop from model • models[i] = model.copy() • scores[i] = score • selected_model = models[argmin(scores)] 34
Pruning Pass 1 h( x t ) h( t x
) h( x t ) ⇥ h ( x s ) h( x t ) ⇥ h ( s x ) 35
Final Model [yi ˆ yi]2 d(Q 1))2 y = a0
+ a1 h ( t x ) + a2 h ( x t ) h ( x s ) 36
37
Implementation Goals •Compatible with numpy ecosystem •Fast and reliable •Easy
to maintain 38
39
40
>git clone git://github.com/jcrudy/py-earth.git >cd py-earth >sudo python setup.py install Installation
41
Important Earth Methods •fit(X,y) •transform(X) •predict(X) 42
Simple Example 43
Simple Example 44
45
With Pandas 46
With Patsy 47
Classification 48
Classification 49
50
Future Plans •Documentation •Integrate into scikit-learn •Multiple responses •Sample weights
51
Summary • MARS is a simple but flexible regression method
• py-earth is MARS for Python data stack • Try it! 52
py-earth A far better thing than I have ever done
• https://github.com/jcrudy/py-earth 53