Interpretability: the process of giving explanations to humans.
• There is no clear definition, but it means:
- explaining not only the output of a model (what) but also the reason (why);
- interpreting the model by using some method or criteria.
Why do we need to interpret the model?
• We can make fair and ethical decisions.
- e.g. medical diagnosis, terrorism detection.
• We can trust the model's predictions.
• We can generate hypotheses.
- e.g. a simple regression model might reveal a strong association between thalidomide use and birth defects, or between smoking and lung cancer (Wang et al., 1999).
Wang et al., "Smoking and the occurrence of Alzheimer's disease: Cross-sectional and longitudinal data in a population-based study," American Journal of Epidemiology, 1999.
1. Use an interpretable model.
- Rule-based methods, decision trees, linear regression models, …
- Sometimes we cannot get enough accuracy by doing so.
2. Use some method to interpret a complex model.
• Global interpretability
- Interpreting the overall tendency of the model or the data.
- Sometimes not accurate locally.
• Local interpretability
- Interpreting a limited region of the model or the data.
- We can get more accurate explanations.
Global interpretability
• Regression coefficients, feature importance, …
• Surrogate models, sensitivity analysis, …
Local interpretability
• Maximum activation analysis, …
• LIME, LOCO, SHAP, …
Model agnostic: treating the original model as a black box.
• Sparse linear regression: y = Xβ + ε. By doing Lasso regression, we learn which features are important (uninformative coefficients are driven to zero).
• Feature importance
- Random Forest (the trees' feature_importances_ attribute).
- In a model-agnostic way: do ROC analysis for each explanatory variable.
A minimal sketch of both ideas is shown below.
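A minimal sketch, assuming scikit-learn and a synthetic dataset; the models and hyperparameters are illustrative, not from the slides.

```python
# Lasso coefficients and Random Forest feature importances on a toy dataset.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Sparse linear regression: the L1 penalty drives uninformative coefficients to zero.
lasso = Lasso(alpha=1.0).fit(X, y)
print("Lasso coefficients:", np.round(lasso.coef_, 2))

# Tree-based feature importance (model-specific).
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print("RF feature importances:", np.round(forest.feature_importances_, 3))
```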
• Surrogate models
- Finding a simple model that can substitute for the complex model.
- Surrogate models are usually created by training a linear regression or a decision tree on the original inputs and the predictions of the complex model.
• Sensitivity analysis
- Investigating whether model behavior and outputs remain stable when the data is perturbed, e.g. by examining S(x) = ∂f/∂x.
[Diagram: the complex model's predictions on the original inputs are used to teach a simple surrogate model.]
A sketch of both techniques follows.
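A minimal sketch, assuming scikit-learn: a shallow surrogate tree distilled from a gradient-boosting model, plus a finite-difference sensitivity check; the dataset and model choices are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

complex_model = GradientBoostingClassifier(random_state=0).fit(X, y)
y_hat = complex_model.predict(X)               # predictions of the complex model

# Surrogate: a shallow tree trained to mimic the complex model, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_hat)
print("Fidelity (agreement with complex model):",
      accuracy_score(y_hat, surrogate.predict(X)))

# Sensitivity: perturb one feature at a time and watch the predicted probability,
# a finite-difference stand-in for S(x) = df/dx.
x0 = X[:1].copy()
eps = 1e-2
for j in range(X.shape[1]):
    x_pert = x0.copy()
    x_pert[0, j] += eps
    delta = (complex_model.predict_proba(x_pert)[0, 1]
             - complex_model.predict_proba(x0)[0, 1]) / eps
    print(f"feature {j}: dP/dx ~ {delta:+.3f}")
```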
• Maximum activation analysis
- Finding the examples that maximally activate certain neurons, layers, or filters in a neural network, or certain trees in decision-tree ensembles.
A small sketch is given below.
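A minimal sketch, assuming scikit-learn's MLPClassifier on the digits dataset; the hidden unit to inspect is arbitrary, and the activation is recomputed from the fitted first-layer weights.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0).fit(X, y)

# First-layer activations: relu(X @ W + b), recomputed from the fitted weights.
W, b = mlp.coefs_[0], mlp.intercepts_[0]
activations = np.maximum(0, X @ W + b)          # shape (n_samples, 32)

unit = 7                                        # arbitrary hidden unit to inspect
top = np.argsort(activations[:, unit])[-5:][::-1]
print("Digits that most activate hidden unit", unit, ":", y[top])
```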
Paper: "Why Should I Trust You?": Explaining the Predictions of Any Classifier (KDD '16)
Authors: Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin (University of Washington)
"If the users do not trust a model or a prediction, they will not use it."
• Determining trust in individual predictions is an important problem when the model is used for decision making.
• By "explaining a prediction", they mean presenting textual or visual artifacts that provide a qualitative understanding of the relationship between the instance's components (e.g. words in text, patches in an image) and the model's prediction.
• Explanations must be interpretable: they provide a good understanding of the relationship between the input variables and the response.
• Another essential criterion is local fidelity: explanations must correspond to how the model behaves around the instance being explained.
• Explainers should be model-agnostic: treating the original model as a black box.
• x ∈ ℝ^d : the original representation of an instance being explained.
- For text classification, x may be a feature vector such as a word embedding.
- For image classification, x is a tensor with three color channels per pixel.
• x′ ∈ {0, 1}^{d′} : a binary vector for its interpretable representation.
- For text classification, x′ is a binary vector indicating the presence or absence of a word.
- For image classification, x′ is a binary vector indicating the "presence" or "absence" of a contiguous patch of similar pixels (a super-pixel).
A small sketch of such an interpretable representation for text is given below.
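A minimal sketch with hypothetical helper functions (not the paper's code): the interpretable representation of a text instance is a binary presence/absence vector over its words.

```python
import numpy as np

def to_interpretable(text):
    """Return the instance's word list and its all-ones binary vector x'."""
    words = text.split()
    return words, np.ones(len(words), dtype=int)

def from_interpretable(words, z_prime):
    """Map a binary vector z' back to a raw text z by dropping absent words."""
    return " ".join(w for w, keep in zip(words, z_prime) if keep)

words, x_prime = to_interpretable("this movie was utterly fantastic")
z_prime = np.array([1, 1, 0, 0, 1])          # a perturbed interpretable sample
print(from_interpretable(words, z_prime))    # -> "this movie fantastic"
```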
LIME learns a simple explainer from instances sampled around x.
[Figure: the instance being explained x, the locally fitted explainer, the original complex decision boundary, and the sampled positive and negative instances.]
How do they sample around x?
• They sample instances around x′ by drawing the non-zero elements of x′ uniformly at random.
• Given a perturbed sample z′ ∈ {0, 1}^{d′}, they recover a sample in the original representation z ∈ ℝ^d and obtain f(z), which is used as a label for the explanation model.
[Figure: original representation x → interpretable representation x′ → perturbed sample z′ → recovered sample z → label f(z).]
A sampling sketch is given below.
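A minimal sketch (my own, not the authors' code): perturbed samples z′ are drawn by switching off a random subset of the instance's words, and each recovered text z is labelled with a stand-in black-box classifier f.

```python
import numpy as np

rng = np.random.default_rng(0)
words = "this movie was utterly fantastic".split()
d_prime = len(words)

def black_box_f(text):
    # Placeholder for the real classifier's positive-class probability.
    return float("fantastic" in text)

samples, labels = [], []
for _ in range(5):
    n_keep = rng.integers(1, d_prime + 1)                 # how many words survive
    keep_idx = rng.choice(d_prime, size=n_keep, replace=False)
    z_prime = np.zeros(d_prime, dtype=int)
    z_prime[keep_idx] = 1
    z_text = " ".join(w for w, k in zip(words, z_prime) if k)
    samples.append(z_prime)
    labels.append(black_box_f(z_text))                    # f(z) labels the sample

print(np.array(samples), labels)
```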
Balancing local fidelity with the complexity of the explanation. Notation:
• x : the instance being explained.
• f : the target model we want to interpret.
• g ∈ G : an interpretable model, such as a linear model or a decision tree.
• π_x(z) : a proximity measure between an instance z and x.
• Ω(g) : a measure of the complexity of the explanation g.
• ℒ(f, g, π_x) : the locality-aware loss.
The explanation is obtained by solving ξ(x) = argmin_{g ∈ G} ℒ(f, g, π_x) + Ω(g).
Let G be the class of linear models: g(z′) = w_g · z′.
• The locality loss ℒ is a weighted square loss:
ℒ(f, g, π_x) = Σ_{z, z′ ∈ Z} π_x(z) (f(z) − g(z′))²
- f(z) : the probability that z belongs to a certain class.
- π_x(z) : a sample weight that reflects the distance between x and z.
• The sampling weight π_x(z) is defined as: π_x(z) = exp(−D(x, z)² / σ²).
A sketch of fitting this locally weighted linear model follows.
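A minimal sketch (not the lime package itself): the local explainer g(z′) = w·z′ is fit by weighted least squares with weights π_x(z) = exp(−D(x, z)²/σ²); the perturbed samples and black-box outputs below are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
d_prime = 5
x_prime = np.ones(d_prime)                         # instance in interpretable space

# Pretend these came from the sampling step: perturbed z' and black-box outputs f(z).
Z_prime = rng.integers(0, 2, size=(200, d_prime))
f_z = 0.8 * Z_prime[:, 4] + 0.1 * Z_prime[:, 0] + 0.05 * rng.normal(size=200)

# Exponential kernel on the distance to x' in the interpretable space.
sigma = 1.0
D = np.linalg.norm(Z_prime - x_prime, axis=1)
pi_x = np.exp(-(D ** 2) / sigma ** 2)

# Weighted least squares; a tiny Ridge penalty keeps the sketch numerically stable.
explainer = Ridge(alpha=1e-3).fit(Z_prime, f_z, sample_weight=pi_x)
print("local feature weights:", np.round(explainer.coef_, 3))
```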
• When we want explanations for image classifiers, we wish to highlight the super-pixels with positive weight toward a specific class.
• Classifier: Google's Inception neural network.
• Example: the explanation shows the reason why an image of an acoustic guitar was classified as "electric guitar".
A usage sketch with the open-source lime package is shown below.
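A minimal usage sketch, assuming the open-source lime package (pip install lime); the random image and the dummy classifier_fn stand in for a real photo and for Inception, so the output is only illustrative.

```python
import numpy as np
from lime import lime_image

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))                    # stand-in for a real RGB image

def classifier_fn(images):
    """Dummy black box: 'class 1' probability grows with the mean green channel."""
    p1 = images[..., 1].mean(axis=(1, 2))
    return np.stack([1 - p1, p1], axis=1)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, classifier_fn,
                                         top_labels=2, num_samples=200)

# Super-pixels with positive weight toward the top predicted class.
img, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                            positive_only=True, num_features=5)
print("highlighted super-pixels:", np.unique(mask))
```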
Submodular pick (SP-LIME): explaining the model
• Explaining a single prediction is not enough to evaluate and assess trust in the model as a whole. They propose to give a global understanding of the model by explaining a set of individual instances, i.e. selecting a good set of instances to explain the whole model.
• The pick problem:
- Given: a set of instances X.
- Output: B instances for users to inspect.
The selected instances should cover as many important features as possible.
• c(V, W, I) : coverage of a set of instances V, given the explanation matrix W and feature importances I; c(V, W, I) = Σ_j 1[∃ i ∈ V : W_ij > 0] · I_j.
• I_j : the global importance of feature j.
• The goal of the pick problem is to find a set of instances that maximizes coverage: Pick(W, I) = argmax_{V, |V| ≤ B} c(V, W, I).
• Because coverage is submodular, a greedy algorithm can be used to solve it (see the sketch below).
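A minimal sketch of the greedy pick, using a random matrix W as a stand-in for the per-instance LIME weights; the importance I_j follows the square-root column sum the paper uses for text.

```python
import numpy as np

def coverage(V, W, I):
    """Sum the global importance of every feature used by at least one instance in V."""
    if not V:
        return 0.0
    covered = (W[list(V)] > 0).any(axis=0)
    return float(I[covered].sum())

def submodular_pick(W, I, B):
    """Greedily add the instance with the largest coverage gain, B times."""
    V = []
    for _ in range(B):
        gains = [(coverage(V + [i], W, I), i) for i in range(W.shape[0]) if i not in V]
        V.append(max(gains)[1])
    return V

rng = np.random.default_rng(0)
W = rng.random((6, 4)) * (rng.random((6, 4)) > 0.5)   # sparse explanation weights
I = np.sqrt(W.sum(axis=0))                            # global importance (text variant)
print("picked instances:", submodular_pick(W, I, B=2))
```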
• They evaluate LIME and SP-LIME in the following settings:
1. Can users select the best classifier?
2. Can non-experts improve a classifier?
3. Do explanations lead to insights?
• Dataset:
- Training: "Christianity" and "Atheism" documents (which include features that do not generalize, such as authors and headers).
- Testing: 819 webpages about these classes (real-world data).
• Human subjects are recruited on Amazon Mechanical Turk.
Can users select the best classifier?
• Evaluating whether explanations can help users decide which classifier is better.
• Two classifiers:
- An SVM trained on the original 20 newsgroups dataset.
- An SVM trained on a "cleaned" dataset (authors and headers removed).
• Users are asked to select which classifier will perform better in the real world.
• Explanations are produced by a greedy baseline or by LIME.
• Instances are selected either by random pick or by submodular pick.
Can non-experts improve a classifier?
• Explanations can aid in feature engineering by presenting the important features, so that features that do not generalize can be removed.
• Users are asked to identify which words from the explanations should be removed from subsequent training, starting from the worse classifier of the previous experiment.
• Explanations are produced by SP-LIME or RP-LIME (random pick).
Do explanations lead to insights?
• Task: distinguishing wolves from huskies.
• They train a logistic regression classifier on a training set of 20 images in which all pictures of wolves had snow in the background, while pictures of huskies did not. → The classifier predicts "wolf" if there is snow and "husky" otherwise.
• The experiment proceeds as follows:
1. They present a balanced set of 10 test predictions without explanations, where one wolf is not in a snowy background (and thus is predicted as "husky") and one husky is in a snowy background (and is predicted as "wolf"). The other 8 examples are classified correctly.
2. They show the same images with explanations.
Do explanations lead to insights? (continued)
Subjects are asked:
1. Do you trust this algorithm to work well in the real world?
2. Why?
3. How do you think the algorithm is able to distinguish between these photos of wolves and huskies?
• Result: explaining individual predictions is effective for getting insights into classifiers.
[Chart: subjects who answered correctly vs. incorrectly, before and after seeing explanations.]
Conclusion and future work
• They proposed LIME, a novel interpretation method that can faithfully explain the predictions of any model in an interpretable way.
• They also introduced SP-LIME, a method to select representative predictions that give a global view of the model.
• Their experiments show that the explanations are useful.
• Future work includes other explanation models, such as decision trees, and other domains, such as speech, video, medical data, and recommender systems.