Attribute - Python library for Neural Network Interpretability

Final capstone presentation for Attribute, a Python library of easy-to-use methods for interpreting TensorFlow- or PyTorch-based neural networks.

Praveen Sridhar

May 15, 2020


  1. Attribute: Neural Network Interpretability in PyTorch and TensorFlow

    Plaksha Tech Leaders Fellowship Capstone Project. Praveen Sridhar. Mentored by Nikhil Narayan, Lead Data Scientist, mfine.co
  2. Introduction

    Predictions of deep learning models are difficult to explain and interpret. Regulated industries such as medicine and finance often avoid deep learning techniques for this reason. Interpretability techniques help users to:
    • Trust and understand why predictions are made in a certain way
    • Provide accountability when models are used in important decision making
  3. Gradient Attribution

    The gradient of the output with respect to the input image is taken as the attribution map.
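As an illustration of the idea on this slide, here is a minimal PyTorch sketch. It is not the attribute library's API: the function name `gradient_attribution`, the toy model, and the input sizes are assumptions made for the example.

```python
import torch
import torch.nn as nn


def gradient_attribution(model, image, class_index):
    """Gradient of one class score with respect to the input image.

    Hypothetical helper for illustration; returns a tensor shaped like `image`.
    """
    model.eval()
    # Work on a detached copy so the caller's tensor is left untouched.
    x = image.detach().clone().requires_grad_(True)
    scores = model(x)                        # (batch, num_classes)
    score = scores[:, class_index].sum()     # scalar to differentiate
    (grad,) = torch.autograd.grad(score, x)  # same shape as the input
    return grad


# Toy stand-in for a real classifier (assumption, not from the deck).
torch.manual_seed(0)
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 10),
)
toy_image = torch.rand(1, 3, 8, 8)
attribution_map = gradient_attribution(toy_model, toy_image, class_index=3)
```

For display, the absolute value or the maximum over colour channels is often taken before plotting; that step is a visualisation choice rather than part of the method.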
  4. Smooth Gradients

    A smoothed version of gradient attribution. Random noise is added to the input image several times, and the gradients of the noisy copies are averaged to give the final attribution.
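The averaging step can be sketched in PyTorch as follows. The function name `smooth_grad` and the defaults `n_samples=25` and `noise_std=0.1` are illustrative assumptions, not values taken from the deck or the library.

```python
import torch
import torch.nn as nn


def smooth_grad(model, image, class_index, n_samples=25, noise_std=0.1):
    """Average input gradients over several Gaussian-noised copies of `image`.

    Hypothetical helper for illustration only.
    """
    model.eval()
    total = torch.zeros_like(image)
    for _ in range(n_samples):
        # Fresh noisy copy of the input that tracks gradients.
        noisy = (image.detach() + noise_std * torch.randn_like(image)).requires_grad_(True)
        score = model(noisy)[:, class_index].sum()
        (grad,) = torch.autograd.grad(score, noisy)
        total += grad
    return total / n_samples


# Toy stand-in classifier and input (assumptions for the example).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.ReLU(), nn.Linear(16, 10))
toy_image = torch.rand(1, 3, 8, 8)
smoothed_map = smooth_grad(toy_model, toy_image, class_index=3)
```

The noise scale is usually chosen relative to the input's value range; too little noise reproduces the plain gradient, while too much washes out the structure.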
  5. GradCAM

    A generalization of CAM (Class Activation Maps) to arbitrary architectures. CAM works only for networks whose final convolutional layer is followed by a Global Average Pooling layer:

        Alpha = Weights[:, class_index]            # (512,)
        FeatureMaps = getLastConvLayer()           # (7,7,512)
        CAM = (Alpha * FeatureMaps).sum(axis=-1)   # (7,7)

    The map is then upsampled to the original image size and overlaid on the image. In GradCAM, the equivalent of global average pooling is performed on the gradients of the output with respect to the feature maps A_ij.
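The GradCAM step can be sketched in PyTorch as below. `TinyCNN` is a made-up network with no Global Average Pooling layer, and `grad_cam` is a hypothetical helper; neither comes from the attribute library. The sketch assumes the model exposes `features` and `classifier` submodules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyCNN(nn.Module):
    """Toy classifier for 16x16 RGB inputs (assumption for the example)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(nn.Flatten(), nn.Linear(16 * 8 * 8, num_classes))

    def forward(self, x):
        return self.classifier(self.features(x))


def grad_cam(model, image, class_index):
    """Weight the last conv feature maps by their spatially averaged gradients."""
    model.eval()
    feature_maps = model.features(image)                       # (B, K, h, w)
    score = model.classifier(feature_maps)[:, class_index].sum()
    (grads,) = torch.autograd.grad(score, feature_maps)        # (B, K, h, w)
    # Global average pooling over the gradients gives one weight per channel.
    alpha = grads.mean(dim=(2, 3), keepdim=True)               # (B, K, 1, 1)
    cam = F.relu((alpha * feature_maps).sum(dim=1, keepdim=True))  # (B, 1, h, w)
    # Upsample to the input resolution so the map can be overlaid on the image.
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return cam.squeeze(1).detach()                             # (B, H, W)


torch.manual_seed(0)
heatmap = grad_cam(TinyCNN(), torch.rand(1, 3, 16, 16), class_index=3)
```

The final ReLU keeps only the regions that push the class score up, which is why the resulting heat map is non-negative.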
  6. FullGrad

    FullGrad saliency decomposes the neural network output into input-sensitivity and neuron-sensitivity components to give an exact representation of attribution.
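For reference, the decomposition can be written as below for a piecewise-linear (for example ReLU) network. The notation is an assumption for this note: f is the network output, x the input, and b the vector of all bias terms in the network.

```latex
f(\mathbf{x}; \mathbf{b})
  = \nabla_{\mathbf{x}} f(\mathbf{x}; \mathbf{b})^{\top} \mathbf{x}
  + \nabla_{\mathbf{b}} f(\mathbf{x}; \mathbf{b})^{\top} \mathbf{b}
```

The first term is the input-sensitivity component and the second collects the per-neuron bias-gradient contributions; "exact" here means that the two terms sum to the output with no remainder.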
  7. FullGrad

    In the attribute library, the implementation is flexible: it automatically extracts the required intermediate gradients and bias values using internal PyTorch APIs.
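The library's actual extraction code is not shown in the deck. As a rough sketch of the idea only, the following uses the public `named_parameters` and `torch.autograd.grad` calls (not the internal APIs the slide mentions) to collect bias values and their gradients for a small ReLU network. The helper name `fullgrad_terms` and the toy MLP are assumptions.

```python
import torch
import torch.nn as nn


def fullgrad_terms(model, x):
    """Return (output, input term, per-layer bias terms) for a scalar-output model.

    Hypothetical helper: input term = sum(grad_x * x), bias term = sum(grad_b * b).
    """
    model.eval()
    x = x.detach().clone().requires_grad_(True)
    # Collect every bias parameter by name.
    biases = [p for name, p in model.named_parameters() if name.endswith("bias")]
    output = model(x).sum()
    # One backward pass yields gradients for the input and all biases.
    grads = torch.autograd.grad(output, [x] + biases)
    input_term = (grads[0] * x).sum().detach()
    bias_terms = [(g * b).sum().detach() for g, b in zip(grads[1:], biases)]
    return output.detach(), input_term, bias_terms


torch.manual_seed(0)
toy_mlp = nn.Sequential(nn.Linear(6, 8), nn.ReLU(), nn.Linear(8, 1))
out, input_term, bias_terms = fullgrad_terms(toy_mlp, torch.rand(1, 6))
# For a ReLU network the terms should add back up to the output.
reconstructed = input_term + sum(bias_terms)
```

A full FullGrad saliency map additionally keeps the bias-gradient terms as spatial maps per layer and aggregates them after rescaling; this sketch only checks that the scalar terms reconstruct the output.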
  8. Conclusion

    The library was implemented in both PyTorch and TensorFlow, supporting the following algorithms:
    • Gradient Attribution
    • Integrated Gradients
    • Smooth Gradients
    • GradCAM
    • FullGrad

    The library features a consistent API across the different techniques, as well as a benchmarking utility.
  9. Future Scope

    The project can be expanded to:
    • More techniques for interpretability, especially neuron visualisation
    • Support for other modalities such as text and speech

    A possible line of research emerged while testing the FullGrad technique: the heat maps it produced were not very class-discriminative. Combining it with ideas from GradCAM could potentially solve this.