
PyCon India - Commodity Machine Learning; past, present and future

Andreas Mueller
September 25, 2016


PyCon India 2016 keynote


Transcript

  1. Commodity Machine Learning: Past, present and future (Andreas Mueller)

  2. What is machine learning?

  3. Automatic Decision Making: Spam? Yes / No

  4. Spam? Yes / No

  5. Programming Machine Learning

  6. Machine learning is EVERYWHERE

  10. Science, Engineering, Medicine, ...

  11. Commodity machine learning

  12. past

  13. +

  15. dawn of open source tools...

  16. The age of shell

  17. Documentation? Testing?

  18. Scikit-learn: User centric machine learning

  19. .fit(X, y) .predict(X) .transform(X)
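
The three methods on this slide are the core of the scikit-learn estimator interface. A minimal sketch, assuming a made-up toy dataset and an arbitrary choice of `StandardScaler` and `LogisticRegression` (neither is named on the slide):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration: two well-separated classes.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]])
y = np.array([0, 0, 0, 1, 1, 1])

# Transformers learn their parameters in fit() and apply them in transform().
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Predictors learn from (X, y) in fit() and label new rows in predict().
clf = LogisticRegression().fit(X_scaled, y)
pred = clf.predict(scaler.transform([[0.1, 0.1], [3.0, 3.0]]))
print(pred)
```

Because every estimator exposes the same method names, swapping one model for another usually means changing a single line.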

  20. present

  21. Choose your ecosystem.

  22. Open! Documented! Tested!

  23. Usability is key!

  24. ML Frameworks: PyMC, Edward, Stan; theano, tensorflow, keras

  26. from sklearn.model_selection import GridSearchCV
      from sklearn.pipeline import Pipeline
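
The two imports on this slide are typically used together. A minimal sketch, assuming the iris dataset, an SVM, and an arbitrary grid of `C` values (all chosen here for illustration, not taken from the talk):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The pipeline chains scaling and classification, so the scaler is
# re-fit on the training part of every cross-validation split.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a step are addressed as <step name>__<parameter>.
param_grid = {"svc__C": [0.1, 1, 10]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)
```

Searching over the whole pipeline, rather than scaling once up front, keeps information from the held-out folds out of the preprocessing step.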

  27. github.com/scikit-learn-contrib/scikit-learn-contrib

  28. (near) Future

  29. 0.18: for the release candidate, pip install scikit-learn==0.18rc2

  30. sklearn.cross_validation, sklearn.grid_search, sklearn.learning_curve → sklearn.model_selection

  31. results = pd.DataFrame(grid_search.results_)
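
A sketch of how the search results can be loaded into pandas. Note that in released scikit-learn versions the attribute is named `cv_results_` (the slide, written against the release candidate, shows `results_`); the dataset and parameter grid below are assumptions made for illustration:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
grid_search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3)
grid_search.fit(X, y)

# cv_results_ is a dict of arrays with one entry per parameter setting,
# so it converts directly into a DataFrame with one row per setting.
results = pd.DataFrame(grid_search.cv_results_)
print(results[["param_C", "mean_test_score", "rank_test_score"]])
```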

  32. labels → groups; n_folds → n_splits
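
A small sketch of the renamed arguments in use, assuming `GroupKFold` as the splitter and a hypothetical `groups` array (for example, one id per patient):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
# Hypothetical group ids: samples sharing an id must stay together.
groups = np.array([0, 0, 1, 1, 2, 2])

cv = GroupKFold(n_splits=3)  # formerly n_folds
splits = list(cv.split(X, y, groups=groups))  # formerly labels
for train, test in splits:
    # No group appears on both sides of a split.
    assert set(groups[train]).isdisjoint(groups[test])
```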

  33. Old:
        from sklearn.cross_validation import KFold
        cv = KFold(n_samples, n_folds)
        for train, test in cv: ...
      New:
        from sklearn.model_selection import KFold
        cv = KFold(n_folds)
        for train, test in cv.split(X, y): ...
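
A runnable version of the new-style loop, as a sketch on made-up data. The splitter no longer needs the number of samples at construction time; it sees the data only when `split` is called:

```python
import numpy as np
from sklearn.model_selection import KFold

# Made-up data: 10 samples, 2 features.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

cv = KFold(n_splits=5)
fold_sizes = []
for train, test in cv.split(X, y):
    # train and test are integer index arrays into X and y.
    fold_sizes.append((len(train), len(test)))
print(fold_sizes)
```

Since the splitter is independent of the data, the same `cv` object can be reused across datasets or passed to `GridSearchCV`.
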
  34. from sklearn.mixture import GaussianMixture
      from sklearn.mixture import BayesianGaussianMixture
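
A brief sketch of the two mixture classes on synthetic data (two invented, well-separated blobs). The component counts and `max_iter` are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture, GaussianMixture

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-5, 1, size=(100, 2)),
               rng.normal(5, 1, size=(100, 2))])

# Classic EM-fitted mixture with a fixed number of components.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# The Bayesian variant takes an upper bound on the number of components
# and infers a weight for each one from the data.
bgmm = BayesianGaussianMixture(n_components=5, max_iter=500,
                               random_state=0).fit(X)
weights = bgmm.weights_
print(np.round(weights, 3))
```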

  35. PCA(), RandomizedPCA() → PCA()
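
With the two classes merged, the randomized algorithm is selected through the `svd_solver` parameter of `PCA`. A minimal sketch on random data (shapes chosen arbitrarily):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 20))

# Exact decomposition via a full SVD.
pca_full = PCA(n_components=3, svd_solver="full").fit(X)

# Approximate decomposition, the role RandomizedPCA used to play.
pca_rand = PCA(n_components=3, svd_solver="randomized",
               random_state=0).fit(X)
X_reduced = pca_rand.transform(X)
print(X_reduced.shape)
```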

  36. Gaussian Process Rewrite

  37. Isolation Forests
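
A short sketch of outlier detection with `IsolationForest`, assuming synthetic data with two hand-placed outliers (invented for illustration):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_inliers = rng.normal(0, 1, size=(200, 2))
X_outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])
X = np.vstack([X_inliers, X_outliers])

forest = IsolationForest(n_estimators=100, random_state=0).fit(X)

# predict() returns +1 for inliers and -1 for outliers.
pred = forest.predict(X)
# decision_function() gives a score; lower means more anomalous.
scores = forest.decision_function(X)
print(scores[-2:], np.median(scores[:200]))
```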

  38. Play: from sklearn.neural_network import MLPClassifier; Work: import keras
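
For the "play" side, a minimal sketch of `MLPClassifier` on the iris dataset. The hidden layer size, `max_iter`, and the scaling step are assumptions made here, not settings from the talk:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Neural networks are sensitive to feature scale, so scale first.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0),
)
mlp.fit(X_train, y_train)
accuracy = mlp.score(X_test, y_test)
print(accuracy)
```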

  39. pipe = Pipeline([('preprocessing', StandardScaler()),
                       ('classifier', SVC())])
      param_grid = {'preprocessing': [StandardScaler(), None]}
      grid = GridSearchCV(pipe, param_grid)
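
The slide's idea, searching over whole pipeline steps rather than only their parameters, can be sketched end to end as follows. The dataset, the extra `MinMaxScaler` candidate, and the `C` grid are assumptions added for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

pipe = Pipeline([("preprocessing", StandardScaler()),
                 ("classifier", SVC())])

# Using a step name as a key swaps out the whole step;
# None disables the step, as on the slide.
param_grid = {
    "preprocessing": [StandardScaler(), MinMaxScaler(), None],
    "classifier__C": [0.1, 1, 10],
}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
print(grid.best_params_)
```

Newer scikit-learn releases also accept the string `'passthrough'` in place of `None` for a skipped step.
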

  41. (further) Future

  42. Feature / Column names

  43. from __future__ import sklearn.plotting

  44. from __future__ import AutoClassifier

  45. More Transparency

  46. amueller.github.io @amuellerml @amueller t3kcit@gmail.com