PyCon India - Commodity Machine Learning; past, present and future

Andreas Mueller
September 25, 2016

PyCon India 2016 keynote

Transcript

  1. Commodity Machine Learning
    Past, present and future
    Andreas Mueller

  2. What is machine learning?

  3. Automatic Decision Making
    Spam?
    Yes No

  4. Spam?
    Yes No

  5. Programming
    Machine Learning

  6. Machine learning is EVERYWHERE

  10. Science
    Engineering
    Medicine
    ...

  11. Commodity machine learning

  12. past

  13. +

  15. dawn of open source tools...

  16. The age of shell

  17. Documentation? Testing?

  18. Scikit-learn: User centric machine learning

  19. .fit(X, y)
    .predict(X)
    .transform(X)
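
The three methods on this slide are the core of the scikit-learn estimator interface. A minimal sketch, assuming made-up toy data and an arbitrary choice of `StandardScaler` and `LogisticRegression` (neither is named on the slide):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration: two well-separated clusters.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-2, 1, size=(50, 2)),
               rng.normal(2, 1, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Transformers learn statistics with fit and apply them with transform.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Predictors learn with fit(X, y) and produce outputs with predict(X).
clf = LogisticRegression().fit(X_scaled, y)
predictions = clf.predict(X_scaled)
```

Every estimator follows the same pattern, so swapping in a different model usually means changing only the line that constructs it.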

  20. present

  21. Choose your ecosystem.

  22. Open! Documented! Tested!

  23. Usability is key!

  24. ML Frameworks
    PyMC, Edward, Stan
    theano, tensorflow, keras

  26. from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
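
One way these two imports are typically combined is sketched below. The iris data, the step names `scaler` and `svc`, and the candidate values for `C` are assumptions chosen for illustration, not taken from the talk:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Chain preprocessing and model so both are refit inside each CV fold.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a step are addressed as <step name>__<parameter>.
param_grid = {"svc__C": [0.1, 1, 10]}

grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)
best_C = grid.best_params_["svc__C"]
```

Searching over the whole pipeline, rather than scaling once up front, keeps the held-out fold from influencing the preprocessing.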

  27. github.com/scikit-learn-contrib/scikit-learn-contrib

  28. (near) Future

  29. 0.18
    for the release candidate:
    pip install scikit-learn==0.18rc2

  30. sklearn.cross_validation
    sklearn.grid_search
    sklearn.learning_curve
    → sklearn.model_selection

  31. results = pd.DataFrame(grid_search.cv_results_)
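
In released versions of scikit-learn the attribute holding the search results is `cv_results_`, a dict of arrays that converts directly to a DataFrame. A small sketch, assuming the iris data and an arbitrary grid over `C`:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

grid_search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=3)
grid_search.fit(X, y)

# One row per parameter setting, one column per recorded quantity.
results = pd.DataFrame(grid_search.cv_results_)
summary = results[["param_C", "mean_test_score", "rank_test_score"]]
```

Having the results in a DataFrame makes it easy to sort, filter, or plot scores across parameter settings.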

  32. labels → groups
    n_folds → n_splits
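
Both renames show up together in group-aware cross-validation, where the former `LabelKFold` became `GroupKFold`. A minimal sketch with invented data and hypothetical group ids:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

X = np.arange(12).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
# Hypothetical group ids, e.g. one id per patient or per user.
groups = np.array([0, 0, 1, 1, 2, 2])

# n_splits replaces n_folds; groups replaces labels.
cv = GroupKFold(n_splits=3)
folds = list(cv.split(X, y, groups=groups))
```

Each split keeps all samples of a group on the same side, so no group appears in both the training and the test indices.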

  33. # before 0.18
    from sklearn.cross_validation import KFold
    cv = KFold(n_samples, n_folds)
    for train, test in cv:
        ...
    # from 0.18 on
    from sklearn.model_selection import KFold
    cv = KFold(n_splits)
    for train, test in cv.split(X, y):
        ...

  34. from sklearn.mixture import GaussianMixture
    from sklearn.mixture import BayesianGaussianMixture
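
These two classes follow the usual estimator interface. A minimal sketch of `GaussianMixture` on made-up, well-separated data (the cluster positions and `n_components=2` are assumptions for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data: two clusters centred near (-5, -5) and (5, 5).
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-5, 1, size=(100, 2)),
               rng.normal(5, 1, size=(100, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)          # hard cluster assignment per sample
probabilities = gmm.predict_proba(X)  # soft assignment per component
```

`BayesianGaussianMixture` is used the same way; there `n_components` acts as an upper bound, and the fitted weights indicate how many components are effectively in use.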

  35. PCA()
    RandomizedPCA()
    → PCA()
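
With the two classes folded into one, the solver is chosen through the `svd_solver` parameter of `PCA`. A short sketch on random data (the shapes and `n_components=5` are arbitrary assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.normal(size=(200, 20))

# Exact decomposition via a full SVD.
pca_full = PCA(n_components=5, svd_solver="full").fit(X)

# Approximate, randomized solver: the role RandomizedPCA used to play.
pca_rand = PCA(n_components=5, svd_solver="randomized", random_state=0).fit(X)
X_reduced = pca_rand.transform(X)
```

Leaving `svd_solver` at its default lets scikit-learn pick a solver based on the data shape and the number of components.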

  36. Gaussian Process Rewrite

  37. Isolation Forests
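
Isolation forests are exposed as `sklearn.ensemble.IsolationForest`, an outlier detector with the usual `fit` / `predict` interface. A minimal sketch, assuming invented data with two obvious outliers appended:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_inliers = rng.normal(0, 1, size=(200, 2))
X_outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])  # far from the bulk
X = np.vstack([X_inliers, X_outliers])

forest = IsolationForest(random_state=0).fit(X)
labels = forest.predict(X)            # +1 for inliers, -1 for outliers
scores = forest.decision_function(X)  # lower means more anomalous
```

Points that can be isolated with few random splits get low scores, which is why the two distant points score below typical inliers.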

  38. Play
    from sklearn.neural_network import MLPClassifier
    Work
    import keras
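
For the "play" side, `MLPClassifier` behaves like any other scikit-learn estimator. A small sketch, assuming the iris data and an arbitrary single hidden layer of 16 units:

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Neural networks are sensitive to feature scale, so scale first.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
mlp.fit(X, y)
train_accuracy = mlp.score(X, y)
```

This runs on the CPU and suits small experiments; larger models and GPU training are where a dedicated framework such as keras comes in.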

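  39. from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    pipe = Pipeline([('preprocessing', StandardScaler()),
                     ('classifier', SVC())])
    param_grid = {'preprocessing': [StandardScaler(), None]}
    grid = GridSearchCV(pipe, param_grid)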

  40. 40

  41. (further) Future

  42. Feature / Column names

  43. from __future__ import sklearn.plotting

  44. from __future__ import AutoClassifier

  45. More Transparency

  46. amueller.github.io
    @amuellerml
    @amueller
    [email protected]
