Attribute - Python library for Neural Network Interpretability

Final capstone presentation for Attribute, a Python library of easy-to-use methods for interpreting TensorFlow- or PyTorch-based neural networks.

Praveen Sridhar

May 15, 2020

Transcript

  1. Attribute:
    Neural Network Interpretability
    in PyTorch and TensorFlow
    Plaksha Tech Leaders Fellowship Capstone Project
    Praveen Sridhar
    Mentored by
    Nikhil Narayan,
    Lead Data Scientist, mfine.co

  2. Introduction
    Predictions of Deep Learning models are difficult to explain and interpret.
    Because of this, regulated industries such as medicine and finance often
    hold back from using Deep Learning techniques.
    Techniques for Interpretability help users to:
    ● Trust and understand why predictions are made in a certain way
    ● Provide accountability when models are used in important decision making

  3. Interpretability in the Medical Industry
    [Mukundhan et al. Google Research]

  4. Objective
    Research and implement algorithms for Neural Network Interpretability
    in the domain of Computer Vision

  5. Literature Review

  6. Pixel-space Attribution
    [Srinivas et al. NeurIPS 2019]

  7. Saliency Heatmaps
    Figure panels: Original Image, GradCAM, FullGrad, Integrated Gradients
    [Srinivas et al. NeurIPS 2019]

  8. Neuron Visualization
    Figure panels: Gradient Ascent, Integrated Gradients, SUMMIT
    [Hohman et al.]

  9. Implementation

  10. Attribute
    Python Library for Neural
    Network Interpretability in
    PyTorch and TensorFlow

  11. Design
    ● Unified API for different types of techniques
    ● Benchmarking for comparisons

  12. Details of Algorithms implemented

  13. Gradient Attribution
    In this technique, the gradient of the output (class score) with
    respect to the input image is taken as the attribution map
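
    The idea can be sketched in a few lines. This is a minimal illustration,
    not the Attribute library's API: the tiny tanh model, its weights `W1` and
    `w2`, and the function names are all assumptions, and the gradient is
    derived by hand in NumPy where PyTorch or TensorFlow would use autograd.

```python
import numpy as np

def score(x, W1, w2):
    """Scalar class score of a tiny stand-in model: w2 . tanh(W1 x)."""
    return float(w2 @ np.tanh(W1 @ x))

def gradient_attribution(x, W1, w2):
    """Gradient of the score with respect to the input, derived by hand."""
    hidden = np.tanh(W1 @ x)
    # d tanh(z)/dz = 1 - tanh(z)^2, then the chain rule back through W1.
    return W1.T @ (w2 * (1.0 - hidden ** 2))

rng = np.random.default_rng(0)
image = rng.normal(size=(4, 4))           # hypothetical 4x4 single-channel image
W1 = rng.normal(size=(8, image.size))     # hypothetical hidden-layer weights
w2 = rng.normal(size=8)                   # hypothetical output weights

# The attribution map has the same shape as the input image.
attribution = gradient_attribution(image.ravel(), W1, w2).reshape(image.shape)
saliency = np.abs(attribution)            # the magnitude is what is usually displayed
```

    Each entry of `attribution` says how much the score would change for a
    small change in the corresponding pixel.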

  14. Smooth Gradients
    Smoothed version of gradient attribution. Random noise is added to the input
    image several times, and the gradient attributions of the noisy copies are
    averaged to get the final attribution
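
    A minimal sketch of this averaging, under stated assumptions: `smooth_grad`,
    `grad_fn`, and the toy score sum(sin(3x)) are hypothetical names chosen for
    illustration, and the analytic gradient stands in for the autograd call a
    PyTorch or TensorFlow implementation would make.

```python
import numpy as np

def smooth_grad(grad_fn, x, n_samples=50, sigma=0.1, seed=0):
    """Average grad_fn over copies of x perturbed by Gaussian noise (std sigma)."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        noisy = x + rng.normal(scale=sigma, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples

# Stand-in model: score(x) = sum(sin(3 x)), so its gradient is 3 cos(3 x).
def grad_fn(x):
    return 3.0 * np.cos(3.0 * x)

x = np.linspace(-1.0, 1.0, 16).reshape(4, 4)   # hypothetical 4x4 input
plain = grad_fn(x)                              # ordinary gradient attribution
smoothed = smooth_grad(grad_fn, x, n_samples=200, sigma=0.2)
```

    With `sigma=0` the result reduces to the plain gradient attribution; larger
    noise levels trade sharpness for less noisy maps.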

  15. Integrated Gradients
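
    Integrated Gradients attributes to each input feature the difference between
    the input and a baseline, multiplied by the gradient averaged along the
    straight path from the baseline to the input. The sketch below approximates
    that path integral with a midpoint sum. The function names, the tanh toy
    model, and the all-zeros baseline are assumptions made for illustration, not
    the Attribute library's API.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=64):
    """Midpoint-rule approximation of Integrated Gradients."""
    alphas = (np.arange(steps) + 0.5) / steps      # midpoints of [0, 1]
    avg_grad = np.zeros_like(x, dtype=float)
    for alpha in alphas:
        avg_grad += grad_fn(baseline + alpha * (x - baseline))
    avg_grad /= steps
    return (x - baseline) * avg_grad

# Stand-in model: f(x) = tanh(w . x), with a hand-derived gradient.
w = np.array([1.0, -2.0, 0.5])

def f(x):
    return float(np.tanh(w @ x))

def grad_f(x):
    return (1.0 - np.tanh(w @ x) ** 2) * w

x = np.array([0.8, 0.1, -0.4])
baseline = np.zeros_like(x)                        # hypothetical "blank" input
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness: the attributions sum (approximately) to f(x) - f(baseline).
completeness_gap = abs(attributions.sum() - (f(x) - f(baseline)))
```

    The completeness check is a convenient sanity test: if the gap is large,
    more integration steps are needed.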

  16. Integrated Gradients (continued)

  17. GradCAM
    Generalization of CAM (Class Activation Maps) to arbitrary architectures.
    CAM works only for networks whose last convolutional layer is followed by a
    Global Average Pooling layer and the final dense layer:
    Alpha = Weights[:, class_index] # (512,)
    FeatureMaps = getLastConvLayer() # (7,7,512)
    CAM = FeatureMaps @ Alpha # weighted sum over channels, (7,7)
    Upsample to the original image size and overlay.
    In GradCAM, the equivalent of global average pooling is performed on the
    gradients of the output with respect to the feature maps A (averaged over
    the spatial positions i, j) to obtain the channel weights
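
    The relation between the two methods can be sketched with plain arrays. The
    shapes follow the slide, while the random feature maps, the 10-class dense
    head, and the function names `cam` and `grad_cam` are assumptions made for
    illustration. In a real network the gradients would come from autograd
    rather than from the closed form used here.

```python
import numpy as np

def cam(feature_maps, class_weights):
    """CAM: channel-weighted sum of the last conv feature maps."""
    return feature_maps @ class_weights            # (H, W, C) @ (C,) -> (H, W)

def grad_cam(feature_maps, grads):
    """Grad-CAM: channel weights are the spatially averaged gradients."""
    alpha = grads.mean(axis=(0, 1))                # (C,)
    return np.maximum(feature_maps @ alpha, 0.0)   # ReLU keeps positive evidence

rng = np.random.default_rng(0)
H, W, C, n_classes = 7, 7, 512, 10
feature_maps = rng.random((H, W, C))               # stand-in conv activations
weights = rng.normal(size=(C, n_classes))          # stand-in dense-layer weights
class_index = 3

# For a GAP + dense head, score = mean_ij(A) @ weights[:, c], so the gradient
# of the score with respect to every A[i, j, :] is weights[:, c] / (H * W).
grads = np.broadcast_to(weights[:, class_index] / (H * W), (H, W, C))

cam_map = cam(feature_maps, weights[:, class_index])
grad_cam_map = grad_cam(feature_maps, grads)
```

    For this GAP-based head the Grad-CAM map equals the rectified CAM map up to
    the constant factor 1 / (H * W), which is why Grad-CAM is described as a
    generalization of CAM.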

  18. FullGrad
    FullGrad saliency decomposes the neural network output into input
    sensitivity and neuron sensitivity components. The components add up to the
    network output, which gives an exact representation of the attribution
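
    The exactness of the decomposition can be checked on a small ReLU network:
    the output equals the input-gradient dotted with the input plus the sum of
    each bias multiplied by its gradient. The two-layer network, its random
    parameters, and the function names below are assumptions made for
    illustration. The sketch verifies only the decomposition and omits the
    post-processing and aggregation that produce the final saliency map.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(6, 4)), rng.normal(size=6)   # hypothetical hidden layer
w2, b2 = rng.normal(size=6), 0.3                        # hypothetical output layer

def forward(x):
    """f(x) = w2 . relu(W1 x + b1) + b2"""
    return float(w2 @ np.maximum(W1 @ x + b1, 0.0) + b2)

def full_gradients(x):
    """Return the input-gradient and the two bias-gradient terms."""
    mask = (W1 @ x + b1 > 0).astype(float)   # which ReLUs are active at x
    grad_b1 = w2 * mask                      # d f / d b1
    input_grad = W1.T @ grad_b1              # d f / d x
    # d f / d b2 is 1, so the output-bias term is b2 itself.
    return input_grad, grad_b1 * b1, b2

x = rng.normal(size=4)
input_grad, hidden_bias_term, output_bias_term = full_gradients(x)

# Input sensitivity plus neuron (bias) sensitivity reconstructs the output.
reconstructed = input_grad @ x + hidden_bias_term.sum() + output_bias_term
```

    Here `input_grad * x` plays the role of the input sensitivity and
    `hidden_bias_term` the per-neuron sensitivity.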

  19. FullGrad
    The official implementation requires modifying the network definition
    to expose intermediate gradients and biases

  20. FullGrad
    In the Attribute library, the implementation is flexible and
    automatically extracts the required intermediate gradients and
    bias values using internal PyTorch APIs

  21. Sample Outputs

  22. Sample Outputs
    GradCAM

  23. Conclusion
    The Neural Network Interpretability library was implemented in both PyTorch
    and TensorFlow, supporting the following algorithms:
    ● Gradient Attribution
    ● Integrated Gradients
    ● Smooth Gradients
    ● GradCAM
    ● FullGrad
    The library features a consistent API across the different techniques as
    well as a benchmarking utility

  24. Future Scope
    The project can be expanded to support:
    ● More techniques for interpretability, especially neuron visualisation
    ● Other modalities like Text and Speech
    A possible line of research emerged while testing the FullGrad technique:
    the heat maps it produced were not very class discriminative. Combining it
    with ideas from GradCAM could potentially solve this.
