szilard
July 02, 2017
A Benchmark of Open Source Tools for Machine Learning from R - useR! 2017 Conference - Brussels, July 2017
Transcript
A Benchmark of Open Source Tools for Machine Learning from R
Szilárd Pafka, PhD
Chief Scientist, Epoch
useR! 2017 Conference, Brussels, July 2017
Disclaimer: I am not representing my employer (Epoch) in this talk. I can neither confirm nor deny whether Epoch is using any of the methods, tools, results, etc. mentioned in this talk.
Binary classification, 10M records; numeric and categorical features, non-sparse
http://www.cs.cornell.edu/~alexn/papers/empirical.icml06.pdf
http://lowrank.net/nikos/pubs/empirical.pdf
EC2
n = 10K, 100K, 1M, 10M, 100M
Training time, RAM usage, AUC, CPU % by core
Read data, pre-process, score test data
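The measurement loop behind a slide like this can be sketched as follows. This is only an illustration in plain Python, not the benchmark code from the talk: the synthetic data generator `make_data`, the toy learner `train_binned_model`, and the small sample sizes are all hypothetical stand-ins, and only training time and AUC are recorded (RAM usage and per-core CPU % are not measured here).

```python
import random
import time


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formula; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def make_data(n, seed=0):
    """Hypothetical synthetic data: one feature x in [0, 1), P(y = 1 | x) = x."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [1 if rng.random() < x else 0 for x in xs]
    return xs, ys


def train_binned_model(xs, ys, n_bins=10):
    """Toy learner (stand-in for a real tool): positive rate per feature bin."""
    pos = [0] * n_bins
    cnt = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        pos[b] += y
        cnt[b] += 1
    rates = [p / c if c else 0.5 for p, c in zip(pos, cnt)]
    return lambda x: rates[min(int(x * n_bins), n_bins - 1)]


def benchmark(sizes, n_test=2000):
    """For each training size, record (n, training seconds, test AUC)."""
    test_x, test_y = make_data(n_test, seed=999)
    results = []
    for n in sizes:
        train_x, train_y = make_data(n, seed=n)
        start = time.perf_counter()
        model = train_binned_model(train_x, train_y)
        elapsed = time.perf_counter() - start  # only training is timed
        scores = [model(x) for x in test_x]
        results.append((n, elapsed, auc(test_y, scores)))
    return results


if __name__ == "__main__":
    for n, secs, a in benchmark([1_000, 10_000, 100_000]):
        print(f"n={n:>7}  train_time={secs:.4f}s  AUC={a:.3f}")
```

Timing only the training call, and scoring a fixed held-out test set for every n, keeps the numbers comparable across sample sizes.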
10x
http://datascience.la/benchmarking-random-forest-implementations/#comment-53599
Best linear: 71.1
learn_rate = 0.1, max_depth = 6, n_trees = 300
learn_rate = 0.01, max_depth = 16, n_trees = 1000
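To get a rough feel for how different these two settings are in size, here is a small Python sketch. It assumes the parameters describe boosted ensembles of binary trees (the slide text does not say so explicitly), and the class `BoostingSetting` is a hypothetical container, not part of any library. It uses only the fact that a binary tree of depth d has at most 2^d leaves, so the figures are upper bounds, not measured model sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoostingSetting:
    """Hypothetical holder for the three parameters named on the slide."""
    learn_rate: float
    max_depth: int
    n_trees: int

    def max_leaves_per_tree(self) -> int:
        # A binary tree of depth d has at most 2**d leaves.
        return 2 ** self.max_depth

    def max_total_leaves(self) -> int:
        # Upper bound on leaves across the whole ensemble.
        return self.n_trees * self.max_leaves_per_tree()


# The two settings exactly as listed on the slide.
SETTINGS = [
    BoostingSetting(learn_rate=0.1, max_depth=6, n_trees=300),
    BoostingSetting(learn_rate=0.01, max_depth=16, n_trees=1000),
]

if __name__ == "__main__":
    for s in SETTINGS:
        print(s, "-> at most", s.max_total_leaves(), "leaves in total")
```

Under this bound the first setting allows at most 19,200 leaves in total and the second at most 65,536,000. Real trees are usually far smaller, since splitting stops early when nodes run out of data.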
...
R++