A talk on Caffe, Theano, and sklearn-theano


Kyle Kastner

March 13, 2015


  1. Caffe and Theano in sklearn-theano
    With a side of GoogLeNet
    Kyle Kastner, LISA / MILA, Université de Montréal
  2. The Plan Classify Locate ??? Profit

  3. Before ~2012
    • Fixed Feature Extractors: SURF, SIFT, HOG (skimage)
    • Feature Aggregation: Vector Quantization, Sparse Coding, Spatial Pooling (sklearn)
    • Simple Classifier: SVM, linear models (sklearn)
  4. After ~2012: “Deep” Computer vision
    • Trainable Feature Extractors, Trainable Midlevel Features, Trainable Classifier
    • Every stage is a neural net: Neural Net! Neural Net!! Neural Net!!!
    • Tools: Theano, caffe, pylearn2, torch, blocks, Lasagne, ...
    • Not pictured:
      ◦ Monster GPUs
      ◦ CPU clusters
      ◦ 1.2TB of images
  5. “Standard” Convolutional Nets [r8] conv pool conv pool [1],[2],[3]

  6. Many options for training deep nets... but how can we provide “deep learning for everyone”?
    A solution: sklearn-theano
  7. Scikit-Learn (sklearn)
    • popular ML library
    • fast and efficient
    • simple, flat design
    • intuitive core API
    • python
    • open source
      ◦ pip install scikit-learn
    Theano
    • Optimizing compiler
    • Math expressions
    • python
    • writes like numpy
    • Automatic diff!
    • several backends
      ◦ CPU, GPU, switch with one flag!
  8. sklearn
        from sklearn.preprocessing import StandardScaler
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        pipeline = make_pipeline(StandardScaler(),
                                 PCA(whiten=True, n_components=10),
                                 LogisticRegression())
        pipeline.fit(X_train, y_train)
        y_pred = pipeline.predict(X_test)
  9. Theano
        import theano
        import theano.tensor as T
        # randn is a function, not a module, so it needs a from-import
        from numpy.random import randn

        # Shapes for W1, b1, W2, b2, W3, b3
        shapes = [(1000, 728), (1000, 1), (1000, 1000), (1000, 1), (10, 1000), (10, 1)]
        W1, b1, W2, b2, W3, b3 = map(theano.shared, [randn(*shape) for shape in shapes])

        def relu(expr):
            return T.maximum(expr, 0)

        inp = T.fmatrix()
        net = T.nnet.softmax(W3.dot(relu(W2.dot(relu(W1.dot(inp) + b1)) + b2)) + b3)
        f = theano.function([inp], net)
  10. sklearn-theano
    • Kyle Kastner, Michael Eickenberg
    • Starting to have new contributors!
    • Easy interface to trained networks
    • Bring deep nets into sklearn pipelines
    • Includes examples from this talk
    https://github.com/sklearn-theano/sklearn-theano
  11. sklearn-theano
    • Trained Feature Extractor (sklearn-theano): OverFeat (NYU)[4], DeCAF (UCB), Canny, other pretrained models
    • Trainable Classifier (sklearn): linear models, SVM
    • Not pictured:
      ◦ Weeks of expert time and compute time
      ◦ Cross-validation of entire networks!
  12. sklearn-theano features GAN

  13. Generative Adversarial Nets
    • Ian Goodfellow and many others, 2014 [5]
    • Trains two networks
      ◦ Generate examples from random input
      ◦ Discriminate whether generated or real
    Generation: sklearn_theano/examples/plot_mnist_generator.py
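
The two-network setup on the GAN slide can be sketched as a pair of loss functions. This is only an illustration of the standard objective from [5], not the code in plot_mnist_generator.py; the function names and the example discriminator outputs are made up for this sketch.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Cross-entropy the discriminator minimizes.

    d_real: discriminator outputs D(x) on real examples (probabilities).
    d_fake: discriminator outputs D(G(z)) on generated examples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Minimax generator loss: the mean of log(1 - D(G(z)))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs, chosen only for illustration.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])  # D separates real from fake
fooled = discriminator_loss([0.9, 0.8], [0.6, 0.7])     # D mistakes fakes for real
```

The discriminator's loss rises when generated examples fool it, while the generator's loss falls in the same situation, which is what makes the training adversarial.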
  14. sklearn-theano features OverFeat

  15. Overfeat (sklearn_theano/examples/plot_multiple_localization.py)
    • Using names and categories of classes, we can search through images!
    • Categories from WordNet
      ◦ Cat (cat.n.01)
      ◦ Dog (dog.n.01)
    Localization
        X = load_sample_image("cat_and_dog.jpg")
        dog_label = 'dog.n.01'
        cat_label = 'cat.n.01'
        clf = OverfeatLocalizer(top_n=1, match_strings=[dog_label, cat_label])
        points = clf.predict(X)
        dog_points = points[0]
        cat_points = points[1]

        clf = GMM()
        clf.fit(dog_points)
        dog_box = convert_gmm_to_box(clf, "darkred", .6)
        clf.fit(cat_points)
        cat_box = convert_gmm_to_box(clf, "steelblue", .6)
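
The slide does not show how convert_gmm_to_box turns a fitted model into a box. As a rough illustration of the idea, the sketch below summarizes a cloud of matched points with a single Gaussian (mean and standard deviation per axis) and draws a box a fixed number of standard deviations around the centre. The function points_to_box, its n_std parameter, and the sample points are hypothetical and are not part of sklearn-theano.

```python
from statistics import mean, pstdev

def points_to_box(points, n_std=2.0):
    """Return (x_min, y_min, width, height) for a list of (x, y) points.

    The box is centred on the mean of the points and extends n_std
    standard deviations in each direction along both axes.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    cx, cy = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return (cx - n_std * sx, cy - n_std * sy, 2 * n_std * sx, 2 * n_std * sy)

# Made-up locations where a "dog" class matched in the image.
dog_points = [(10, 20), (12, 22), (14, 24), (12, 22)]
dog_box = points_to_box(dog_points)
```

A mixture model with a single component behaves much like this; fitting a density rather than taking the min and max of the points keeps a few stray matches from inflating the box.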
  16. sklearn-theano features GoogLeNet!

  17. GoogLeNet: Why?
    • Make it really easy to use out of the box with sklearn!
    • Bring Caffe into the Theano world!
    • Immediate access to layers
    • Caffe has a python interface called pycaffe
      ◦ But not always easy setup

        X = load_sample_image("sloth_closeup.jpg")
        top_n_classes = 5
        goog_clf = GoogLeNetClassifier(top_n=top_n_classes)
        over_clf = OverfeatClassifier(top_n=top_n_classes)

    Overfeat predictions
        [['otter' 'meerkat, mierkat' 'Border terrier' 'badger'
          'three-toed sloth, ai, Bradypus tridactylus']]
    Overfeat probabilities
        [[ 2.14521634e-07 1.34061577e-06 7.87416593e-06 1.31268520e-04 9.99858141e-01]]
    GoogLeNet predictions
        [['otter' 'African grey, African gray, Psittacus erithacus' 'badger' 'sea lion'
          'three-toed sloth, ai, Bradypus tridactylus']]
    GoogLeNet probabilities
        [[ 0.0191793 0.02727892 0.03366567 0.04920716 0.76793528]]
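
The printed predictions above list the most probable class last. A minimal sketch of how a top_n selection with that ordering could work is shown below; it is an assumption about the behaviour, not the classifiers' actual implementation, and the helper name, class list, and probabilities are made up for illustration.

```python
def top_n_labels(probabilities, class_names, top_n=5):
    """Return the top_n (label, probability) pairs, least probable first."""
    ranked = sorted(zip(probabilities, class_names))  # ascending by probability
    return [(name, p) for p, name in ranked[-top_n:]]

# Illustrative class names and scores, not real network output.
names = ["otter", "badger", "sea lion", "three-toed sloth", "tabby"]
probs = [0.02, 0.03, 0.05, 0.77, 0.01]
top3 = top_n_labels(probs, names, top_n=3)
```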
  18. Key Components
    • Local Response Normalization
      ◦ Local Contrast Normalization but between layers
      ◦ Across channels, with certain size
      ◦ Maybe key piece (batch normalization very similar)
    • Inception Modules
      ◦ Parallel convolutions concatenated
      ◦ Naive version has huge parameter growth
      ◦ Use 1x1 convolution to project downwards
      ◦ Can tune to limit number of parameters
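
The parameter-growth point about Inception modules can be checked with simple arithmetic. The sketch below counts convolution weights (biases ignored) for a 3x3 and a 5x5 branch, first applied directly to the input and then behind a 1x1 projection. The channel counts are illustrative choices for this example, and the helper name is hypothetical.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels

in_channels = 192          # channels entering the module (illustrative)
out_3x3, out_5x5 = 128, 32 # output channels of each branch
red_3x3, red_5x5 = 96, 16  # channels after the 1x1 projection

# Naive module: the large kernels see all input channels.
naive = (conv_weights(3, in_channels, out_3x3)
         + conv_weights(5, in_channels, out_5x5))

# Projected module: a 1x1 convolution shrinks the channels first.
reduced = (conv_weights(1, in_channels, red_3x3) + conv_weights(3, red_3x3, out_3x3)
           + conv_weights(1, in_channels, red_5x5) + conv_weights(5, red_5x5, out_5x5))
```

With these numbers the naive branches need 374784 weights and the projected ones 144896, so the size of the 1x1 projection is the knob that limits the parameter count.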
  19. None
  20. Adding a new Caffe model
    • Download a compatible Caffe for your model
      ◦ If you trained it, you already have this!
    • Compile caffe.proto into caffe_pb2.py
      ◦ This uses protoc, which is part of Google’s protobuf
      ◦ Can apt-get install protobuf-compiler
      ◦ Use protobuf 2.4.x, NOT 3.x
    • Make a new directory
      ◦ sklearn-theano/sklearn_theano/models/mymodel
  21. Adding a new Caffe model, cont.
    • Put caffe_pb2.py in the new mymodel folder
      ◦ Every model has its own caffe_pb2.py
    • Modify the imports in caffe_pb2.py
      ◦ Need relative imports to use externals version
      ◦ ...externals.google instead of google
    • Try to parse the model using our tools
      ◦ feature_extraction/caffe/caffemodel.py
    • Build a new class wrapper
      ◦ See feature_extraction/caffe/googlenet.py
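
The import change in caffe_pb2.py can be made by hand, or with a small text substitution like the sketch below. The helper name is hypothetical, and the sketch assumes the generated file only uses imports of the form "from google.protobuf import ..."; the number of leading dots in the prefix depends on where the model folder sits relative to externals, so treat the default as a placeholder.

```python
import re

def use_vendored_protobuf(source, prefix="...externals."):
    """Rewrite absolute google.protobuf imports into relative ones.

    source: text of a generated caffe_pb2.py file.
    prefix: relative package path placed in front of google.protobuf.
    """
    return re.sub(r"^from google\.protobuf",
                  "from " + prefix + "google.protobuf",
                  source,
                  flags=re.MULTILINE)

# A two-line stand-in for the top of a generated file.
generated = (
    "from google.protobuf import descriptor as _descriptor\n"
    "from google.protobuf import message as _message\n"
)
patched = use_vendored_protobuf(generated)
```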
  22. Adding a new Caffe model, p3
    • If all goes well, it is done!
    • If not, raise an issue on GitHub
    • Specialized nodes need Theano code
    • Send a PR
    • Currently testing this workflow with VGGNet
    • Want to add CIFAR10 NiN
    • Others?
  23. Takeaways
    • Scikit-Learn is an amazing ML package (but no deep nets)
    • Theano is a great package for {C,G}PU math in Python
    • Neural networks are powerful feature extractors
    • Use large neural nets easily in sklearn-theano
      ◦ Use them within sklearn pipelines
      ◦ Use them as building blocks for your own Theano net
  24. Future Work
    • Loaders for newest state of the art convolutional nets
    • Support other types of inputs, e.g. audio and text
    • Enable fine tuning to specific tasks
    • Help us bring powerful feature extractors into the scikit-learn and Theano worlds!
  25. Code: https://github.com/sklearn-theano/sklearn-theano
    Thank You!
    Friend us on TweetHub: @kastnerkyle @meickenberg @kastnerkyle @eickenberg
  26. References
    [1] http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial
    [2] http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf
    [3] http://parse.ele.tue.nl/education/cluster2
    [4] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y. LeCun. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks, International Conference on Learning Representations (ICLR 2014), April 2014. http://arxiv.org/abs/1312.6229
    [5] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative Adversarial Networks, arXiv, June 2014. http://arxiv.org/abs/1406.2661
    [6] Going Deeper With Convolutions, C. Szegedy