

Liz Sander - Software Library APIs: Lessons Learned from scikit-learn

When you think of an API, you’re probably thinking about a web service. But it’s important to think about your developer interface when designing a software library as well! I’ll talk about the scikit-learn package, and how its API makes it easy to construct complex models from simple building blocks, using three basic pieces: transformers, estimators, and meta-estimators. Then I’ll show how this interface enabled us to construct our own meta-estimator for model stacking. This will demonstrate how to implement new modeling techniques in a scikit-learn style, and more generally, the value of writing libraries with the developer interface in mind.
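
The three building blocks named above can be mimicked in a few lines of plain Python. This is only a toy sketch of the interface contract (fit, transform, predict), not scikit-learn's own code: the class names `MeanCenterer`, `MidpointClassifier`, and `TinyPipeline` are hypothetical, the data is one-dimensional, and the classifier assumes binary labels 0 and 1 with class 1 having the larger mean.

```python
# Toy illustration of a scikit-learn-style interface. All class names here
# are hypothetical and are NOT part of scikit-learn.


class MeanCenterer:
    """Transformer: learns a statistic in fit(), applies it in transform()."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self  # returning self lets callers chain fit().transform()

    def transform(self, X):
        return [x - self.mean_ for x in X]


class MidpointClassifier:
    """Estimator: learns from (X, y) in fit(), then predicts labels.

    Assumes binary labels 0/1 and that class 1 has the larger mean.
    """

    def fit(self, X, y):
        zeros = [x for x, label in zip(X, y) if label == 0]
        ones = [x for x, label in zip(X, y) if label == 1]
        # Decision threshold halfway between the two class means.
        self.threshold_ = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold_ else 0 for x in X]


class TinyPipeline:
    """Meta-estimator: wraps other objects and exposes the same interface."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, X, y):
        for transformer in self.transformers:
            X = transformer.fit(X, y).transform(X)
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        for transformer in self.transformers:
            X = transformer.transform(X)
        return self.estimator.predict(X)


model = TinyPipeline([MeanCenterer()], MidpointClassifier())
model.fit([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
predictions = model.predict([0.0, 10.0])
print(predictions)
```

Because the pipeline itself has `fit` and `predict`, it can be used anywhere a plain estimator can, which is the property that lets simple pieces compose into larger models.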

https://us.pycon.org/2018/schedule/presentation/76/

PyCon 2018

May 11, 2018

Transcript

  1. Software Library APIs: Lessons Learned from scikit-learn. Liz Sander, Data Scientist, Civis Analytics. GitHub: elsander, @sander_liz
  2. :(

  3. :)

  4. APIs are for software too! ◉ Think of an API as the “developer interface” (as opposed to the user interface). [Diagram labels: Code, Developer input, Output for developer]
  5. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  6. Software libraries have APIs. It’s worth some upfront time to make them useful. Let’s look at a library that does it well!
  7. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  8. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  9. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  10. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  11. How do I write a function/class for logistic regression? … what about a random forest? … and a neural network?
  12. How do I write a class for logistic regression? … what about a random forest? … and a neural network? How do I create a general framework for modeling?
  13. What makes a good API? ◉ Stable ◉ Integrates with existing tools ◉ Intuitive ◉ Flexible/extendable
  14. Scikit-learn extensions ◉ xgboost, keras, lightning ◉ Civis-maintained ◦ python-glmnet (R wrapper) ◦ civisml-extensions ◦ muffnn ◉ Scikit-learn maintains a list of many others
  15. Train base estimators using {x_i, y_i}. [Diagram labels: Train data x_i, Test data x_j; base estimators Logistic, GBT, Random Forest; Meta-estimator (Logistic)]
  16. Predict base estimators on {x_j}. [Diagram labels: Train data x_i, Test data x_j; base estimators Logistic, GBT, Random Forest; Meta-estimator (Logistic)]
  17. Conclusion ◉ Make your API clear and consistent ◉ Find an abstraction that mirrors your mental model ◉ Think about developers as users
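
Slides 15 and 16 outline model stacking: base estimators (logistic regression, GBT, random forest) are trained on the training data, and their predictions feed a logistic meta-estimator. A minimal sketch of that idea as a scikit-learn-style meta-estimator might look like the following. It is not the speaker's implementation: the class name `StackingSketch` is hypothetical, it assumes binary classification, and the use of out-of-fold predictions via `cross_val_predict` to train the meta-estimator is an assumption that the slides do not spell out.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split


class StackingSketch(ClassifierMixin, BaseEstimator):
    """Hypothetical stacking meta-estimator (binary classification only)."""

    def __init__(self, base_estimators, meta_estimator, cv=3):
        # Following scikit-learn convention, __init__ only stores parameters.
        self.base_estimators = base_estimators
        self.meta_estimator = meta_estimator
        self.cv = cv

    def fit(self, X, y):
        # Assumption: use out-of-fold probabilities as meta-features, so the
        # meta-estimator never sees predictions made on rows a base
        # estimator was trained on.
        meta_features = np.column_stack([
            cross_val_predict(est, X, y, cv=self.cv, method="predict_proba")[:, 1]
            for est in self.base_estimators
        ])
        self.meta_estimator_ = clone(self.meta_estimator).fit(meta_features, y)
        # Refit every base estimator on all training rows for predict time.
        self.base_estimators_ = [clone(est).fit(X, y) for est in self.base_estimators]
        self.classes_ = self.meta_estimator_.classes_
        return self

    def predict(self, X):
        # Base estimators predict on new data; the meta-estimator combines them.
        meta_features = np.column_stack(
            [est.predict_proba(X)[:, 1] for est in self.base_estimators_]
        )
        return self.meta_estimator_.predict(meta_features)


# Synthetic data, used only to show that the pieces fit together.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingSketch(
    base_estimators=[
        LogisticRegression(max_iter=1000),
        GradientBoostingClassifier(random_state=0),
        RandomForestClassifier(n_estimators=50, random_state=0),
    ],
    meta_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)
test_predictions = stack.predict(X_test)
accuracy = stack.score(X_test, y_test)  # score() comes from ClassifierMixin
```

Because the class follows the usual `fit`/`predict` conventions, it can itself be placed inside other scikit-learn tools. Recent scikit-learn releases also ship `sklearn.ensemble.StackingClassifier`, which is likely the better choice outside of a teaching example.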