LIME & SHAP -Interpretability of Machine Learning Model Predictions- 2018-05-19 @ PyCon mini Osaka 2018

About me
Miyauchi Takashi (@hightensan)
M2 (second-year master's student), Graduate School, Osaka University
Web Data Mining
#Python #Twitter #AWS #MachineLearning

Contents
#1 What is interpretability?
#2 LIME
#3 SHAP
#4 Tutorials on Jupyter
#5 Summary

#1 What is interpretability?

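## Background
Machine learning models are becoming more complex and more of a black box
– Deep learning, ensemble learning
– The model itself and its outputs are hard for humans to interpret
The model can predict, but can we trust it?
– It is unclear what the model bases its decisions on
– “The problem is that a single metric, such as classification accuracy, is an incomplete description of most real-world tasks.” ※
※ https://arxiv.org/abs/1702.08608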

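## Background
Growing demand for interpretability in machine learning models
– ICML 2017 Tutorial “Interpretable Machine Learning”
– Workshops: ICML (2018, 2017, 2016), NIPS (2017, 2016)
– Ministry of Internal Affairs and Communications (Japan), draft AI Development Guidelines: principle of transparency, principle of accountability
[Figure: number of papers on interpretability per year ※]
※ http://people.csail.mit.edu/beenkim/papers/BeenK_FinaleDV_ICML2017_tutorial.pdf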

## Interpretability
“Interpretability is the degree to which a human can understand the cause of a decision.” [1]
Presenting the decision criteria of a complex classifier in a form that humans can interpret [2]
[Figure labels: input; output; a representation that shows the basis for the prediction (graphs, etc.)]
1. https://christophm.github.io/interpretable-ml-book/interpretability.html
2. https://www.slideshare.net/shima__shima/kdd2016

## Directions in interpretability research
# Local explanations: present the basis for the prediction on a specific input
# Global explanations: represent a complex model with a readable, interpretable model
# Designing explainable models: design a readable, interpretable model from the start
(# Explaining deep learning models): explain deep learning models, in particular image recognition models
※ https://www.ai-gakkai.or.jp/my-bookmark_vol33-no3/
[Figure labels: understanding the mechanism; understanding the behavior]
Understanding the behavior: explain which inputs produce which outputs → LIME and SHAP

#2 LIME

## LIME
LIME (Local Interpretable Model-agnostic Explanations)
– KDD'16 paper “"Why Should I Trust You?": Explaining the Predictions of Any Classifier”
– Explains the prediction for an individual data point by highlighting features (words in a text, regions of an image, etc.)
※ https://arxiv.org/pdf/1602.04938.pdf

## Examples
Text classification (atheism/christian classification using TF-IDF values)
– Random Forest classifier (with 500 trees)
– Accuracy: 92.4%
The classifier treats header information as strong features. Is it really trustworthy?
※ https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime

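## Examples
Image classification (with a pretrained Inception model)
– The model also predicts "pool table" and "balloon", though with low probability
– Visualizes which parts of the image the model uses as the basis for each prediction
※ https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime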

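## The idea behind LIME
Sample data from the neighborhood of the data point to be explained
– Train a simple, local model so that it approximates the output of the classifier
[Figure labels: data point to be explained; explanation classifier; negative-label data; positive-label data]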

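## The idea behind LIME
Create multiple inputs by modifying parts of the original data, and predict each of them
– For text, remove words; for images, mask regions
Fit a simple, local model to the resulting (input, prediction) pairs → output the basis for the decision
※ https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime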
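The perturbation step for text can be sketched as follows. This is a minimal illustration, not the `lime` library's implementation: `perturb_text` and `toy_classifier` are hypothetical names, and the classifier is a hand-written stand-in for a real black-box model. Each perturbed sample is a binary mask over the words (1 = word kept), paired with the black-box prediction for the masked text.

```python
import random

def perturb_text(words, num_samples, rng):
    """Return (mask, text) pairs; mask[i] == 1 means words[i] is kept."""
    # The first sample is the unperturbed input itself.
    samples = [([1] * len(words), " ".join(words))]
    for _ in range(num_samples - 1):
        mask = [rng.randint(0, 1) for _ in words]
        text = " ".join(w for w, keep in zip(words, mask) if keep)
        samples.append((mask, text))
    return samples

def toy_classifier(text):
    """Hypothetical black box: a score for the class 'christian'."""
    tokens = text.split()
    score = 0.5 + 0.4 * ("church" in tokens) - 0.3 * ("atheism" in tokens)
    return min(max(score, 0.0), 1.0)

words = "the church posted about atheism".split()
rng = random.Random(0)
samples = perturb_text(words, num_samples=8, rng=rng)

# (mask, prediction) pairs are the training data for the local model.
pairs = [(mask, toy_classifier(text)) for mask, text in samples]
for mask, prob in pairs:
    print(mask, round(prob, 2))
```

The masks, not the raw texts, are what the simple local model is trained on, so its coefficients can be read as per-word contributions.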

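## Mathematical background
Solve an optimization problem for the data point $x$ to be explained:
$$\xi(x) = \operatorname*{argmin}_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g)$$
– $G$: the set of interpretable models
– $f$: the classifier to be explained
– $g$: a model in $G$, used as the explanation classifier
– $\pi_x$: similarity between the data point to be explained and the sampled data
– $\mathcal{L}$: loss function, $\mathcal{L}(f, g, \pi_x) = \sum_{z, z'} \pi_x(z)\,\bigl(f(z) - g(z')\bigr)^2$
– $\Omega(g)$: complexity of $g$
The solution is the $g$ that minimizes "the difference from $f$ in the neighborhood of $x$" + "the complexity of $g$"
※ https://arxiv.org/pdf/1602.04938.pdf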
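A minimal sketch of this optimization for numeric features, under several simplifying assumptions that are not in the slides: samples are drawn with Gaussian noise around x, the similarity is an exponential kernel on squared Euclidean distance, g is a linear model applied directly to z (no separate interpretable representation z'), and the complexity term Ω(g) is ignored because the number of features is already small. `black_box` is a hypothetical stand-in for f.

```python
import math
import random

def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w

def explain_locally(f, x, num_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit g(z) = w[0] + sum_i w[i+1] * z[i] by minimizing
    sum_z pi_x(z) * (f(z) - g(z))**2 over samples z drawn near x."""
    rng = random.Random(seed)
    d = len(x)
    # Accumulate the weighted normal equations (X^T W X) w = X^T W y.
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(num_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x]      # sample near x
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        weight = math.exp(-dist2 / kernel_width ** 2)     # pi_x(z)
        row = [1.0] + z                                   # intercept + features
        y = f(z)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve_linear_system(xtwx, xtwy)

def black_box(z):
    """Hypothetical nonlinear model standing in for f."""
    return z[0] ** 2 + 3.0 * z[1]

# [intercept, local slope for z[0], local slope for z[1]]
coefficients = explain_locally(black_box, x=[1.0, 2.0])
print([round(c, 2) for c in coefficients])
```

Around x = (1, 2) the fitted slopes should land near the local gradient of the black box (about 2 and 3), which is the sense in which the simple model explains the complex one only locally.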

#3 SHAP

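## SHAP
SHAP (SHapley Additive exPlanations)
– NIPS'17 paper “A Unified Approach to Interpreting Model Predictions”
– Describes several local explanation methods, including LIME, in a unified way within a game-theoretic framework
Formalizes the properties an explanation model should satisfy and turns them into a measure → SHAP value
※ https://github.com/slundberg/shap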

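## Examples
House price prediction (regression with xgboost)
– Visualizes how much each feature contributes to the output
※ Feature importance is an explanation of the model (not of each individual data point)
→ The two kinds of importance do not necessarily agree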
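The gap between a model-level importance and a per-data-point contribution can be illustrated with a toy linear model. This sketch makes two assumptions that are not in the slides: per-point contributions are computed as w_i * (x_i - mean_i), which is the SHAP value of a linear model when features are treated as independent, and the model-level importance is defined as the mean absolute contribution over the dataset (one possible choice, not xgboost's feature importance). The weights and data are made up.

```python
# Hypothetical linear model f(x) = 2.0 * x[0] + 0.5 * x[1] and a tiny dataset.
weights = [2.0, 0.5]
data = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [1.0, 4.0]]

num_features = len(weights)
means = [sum(row[i] for row in data) / len(data) for i in range(num_features)]

def local_contributions(x):
    """Contribution of each feature to f(x) relative to the average prediction."""
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]

all_contributions = [local_contributions(row) for row in data]

# Model-level importance: mean absolute contribution per feature.
global_importance = [
    sum(abs(c[i]) for c in all_contributions) / len(data)
    for i in range(num_features)
]

print("global importance:", global_importance)
print("contributions for the last data point:", all_contributions[-1])
```

Over the whole dataset feature 0 comes out as more important, yet for the last data point feature 1 carries the entire contribution, which is the kind of disagreement the slide points at.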

## Examples
Image classification (with a pretrained VGG16 model)
– Regions that indicate strawberry have high SHAP values
– Granny_Smith (apple) and fig are also predicted, but their SHAP values are low
※ https://github.com/slundberg/shap

## The idea behind SHAP
Define explanation models in a unified way
– $f$: the complex classifier to be explained
– $g$: the simple classifier used for explanation
– $x$: a single input data point
– $x'$: the simplified input, with $x = h_x(x')$
Explanation model: trained so that $g(z') \approx f(h_x(z'))$ whenever $z' \approx x'$
Additive feature attribution methods
– The simple explanation model can be expressed as a sum of contributions, one per simplified input feature
– This class also includes LIME
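A small sketch of an additive explanation model, written as g(z') = phi0 + sum_i phi_i * z'_i over a binary simplified input z'. The assumptions here are mine, not the slides': a "missing" feature (z'_i = 0) is replaced by a fixed background value when mapping back to the original input space, and the black box is a hypothetical linear model, for which the contributions can be written down directly.

```python
from itertools import product

def h_x(z_prime, x, background):
    """Map a simplified binary input z' back to the original input space."""
    return [xi if keep else bi for keep, xi, bi in zip(z_prime, x, background)]

def g(z_prime, phi0, phi):
    """Additive explanation model: phi0 + sum_i phi[i] * z'[i]."""
    return phi0 + sum(p * z for p, z in zip(phi, z_prime))

# Hypothetical black box: a linear model, so an additive g matches it exactly.
model_weights = [3.0, -2.0]
model_bias = 1.0

def f(z):
    return model_bias + sum(w * zi for w, zi in zip(model_weights, z))

x = [2.0, 1.0]
background = [0.0, 0.0]

phi0 = f(background)  # output when every feature is "missing"
phi = [w * (xi - bi) for w, xi, bi in zip(model_weights, x, background)]

for z_prime in product([0, 1], repeat=len(x)):
    print(z_prime, g(z_prime, phi0, phi), f(h_x(z_prime, x, background)))
```

For a nonlinear black box the two printed columns would only agree approximately, and choosing the phi values so that they agree as well as possible is what the individual methods differ on.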

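## Mathematical background
Define explanation models in a unified way
– Additive feature attribution methods
Formalize the properties that an ideal explanation model should satisfy (described on the following slides)
The only feature attribution method (explanation model) that satisfies these properties can be obtained with cooperative game theory! (apparently...)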

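## Mathematical background
Properties an explanation model should satisfy
#1 Local accuracy
– For an input $x$ with prediction $f(x)$, the local approximation $g(x')$ on the simplified input $x'$ should equal the original output $f(x)$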

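## Mathematical background
Properties an explanation model should satisfy
#2 Missingness
– A feature that does not change the output makes no contribution to the classification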

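## Mathematical background
Properties an explanation model should satisfy
#3 Consistency
– If the output of $f'$ is affected more than the output of $f$ by the presence or absence of feature $i$, then the contribution of feature $i$ should be larger for $f'$ than for $f$
$$f'_x(z') - f'_x(z' \setminus i) \ge f_x(z') - f_x(z' \setminus i) \;\Rightarrow\; \phi_i(f', x) \ge \phi_i(f, x)$$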

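## Mathematical background
Properties an explanation model should satisfy
The only feature attribution method (explanation model) that satisfies Local accuracy, Missingness, and Consistency can be obtained with cooperative game theory
– For every combination of the other features, compute the difference in output between the feature being present and being absent
– SHAP approximates this computation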
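The exhaustive computation over all feature combinations can be sketched for a tiny model. This is the classic Shapley value, written under one assumption of mine: an absent feature is represented by a fixed background value (SHAP itself works with expectations over background data and with approximations, as the slide notes). `model` is a hypothetical black box with an interaction term.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, background):
    """Exact Shapley values: for each feature i, a weighted average over all
    subsets S of the other features of f(S with i) - f(S without i)."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their background value.
        z = [x[j] if j in subset else background[j] for j in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def model(z):
    """Hypothetical black box with an interaction between features 1 and 2."""
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]
background = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, background)

print("contributions:", phi)
# Local accuracy: the contributions add up to f(x) - f(background).
print(sum(phi), model(x) - model(background))
```

The number of subsets grows as 2^n in the number of features, which is why an exact computation like this is only practical for very small n and an approximation is needed otherwise.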

#4 Tutorials on Jupyter

#5 Summary

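## Summary
Ensuring the interpretability of machine learning models: LIME and SHAP
– Visualize the features that formed the basis of each individual prediction
– Lower the hurdle for error analysis and ongoing maintenance
Supporting the final decision made by humans
– Presenting explanations, not just accuracy, enables humans and machine learning models to work together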

## References
– LIME (authors' implementation): https://github.com/marcotcr/lime
– SHAP (authors' implementation): https://github.com/slundberg/shap
– KDD2016 study group slides: https://www.slideshare.net/shima__shima/kdd2016
– Interpretability in Machine Learning (機械学習における解釈性): https://www.ai-gakkai.or.jp/my-bookmark_vol33-no3/
– Methods for understanding the basis of deep learning decisions (ディープラーニングの判断根拠を理解する手法): https://qiita.com/icoxfog417/items/8689f943fd1225e24358
– ICML 2017 Tutorial: http://people.csail.mit.edu/beenkim/papers/BeenK_FinaleDV_ICML2017_tutorial.pdf
– Interpretable Machine Learning: https://christophm.github.io/interpretable-ml-book/

Questions? @hightensan
