Neural Trojan mini review

Neural Trojans mini review 2020/01/12 @IIJ – 2nd Cybersecurity LT Meetup in Tokyo (第二回サイバーセキュリティ系LT会 in 東京)
Shuntaro OHNO

About Me
Shuntaro OHNO
• Twitter: @doraneko_b1f
• GitHub: @doraneko94
• Website: https://ushitora.net
✓ Neuroscientist: Ph.D. student at Toyama Univ.
✓ Memory, Learning, Artificial Intelligence
✓ Data science in Python & Neuro-Simulation in Rust
➢ In this talk, I cover how to brainwash an artificial intelligence and how to defend against it.

What is “Neural Trojan”?
“We define the malicious hidden functionalities incorporated in neural IPs by the IP vendor as Neural Trojans” [Liu et al. 2017]
IP: Intellectual Property
[Chou et al. 2018]

Liu et al. “Neural Trojans”
Figure labels: permitted / Not permitted; data the customer expects; training data used by the attacker

Gu et al. “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain”

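Gu et al. “BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain”
Figure labels: activity of the last conv layer (original); activity of the last conv layer (after transfer learning)
➢ Starting from the adversary's model, transfer learning is performed for a different image recognition task (only the final fully connected layer is retrained).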
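As a rough illustration of how a BadNets-style backdoor can be planted, the sketch below poisons a toy training set: a small patch is stamped onto a fraction of the images and those samples are relabeled to the attacker's target class. The helper names (`stamp_trigger`, `poison_dataset`), the patch shape, the poisoning rate, and the target label are all assumptions made for this example; they are not taken from the paper or the slides.

```python
import random

def stamp_trigger(image, value=1.0, size=2):
    """Return a copy of a 2-D image (list of rows) with a size x size patch
    of `value` written into the bottom-right corner (hypothetical trigger)."""
    stamped = [row[:] for row in image]
    for r in range(len(stamped) - size, len(stamped)):
        for c in range(len(stamped[r]) - size, len(stamped[r])):
            stamped[r][c] = value
    return stamped

def poison_dataset(images, labels, target_label, rate, seed=0):
    """Stamp the trigger on a random fraction `rate` of the samples and
    relabel them as `target_label`. The inputs are left unmodified."""
    rng = random.Random(seed)
    n_poison = int(len(images) * rate)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images, new_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            new_images.append(stamp_trigger(img))
            new_labels.append(target_label)
        else:
            new_images.append([row[:] for row in img])
            new_labels.append(lab)
    return new_images, new_labels, sorted(chosen)

# Toy data: ten blank 4x4 "images" with labels 0..2.
clean_images = [[[0.0] * 4 for _ in range(4)] for _ in range(10)]
clean_labels = [i % 3 for i in range(10)]
poisoned_images, poisoned_labels, poisoned_idx = poison_dataset(
    clean_images, clean_labels, target_label=7, rate=0.2)
```

A model trained on `poisoned_images` and `poisoned_labels` would be expected to behave normally on clean inputs while associating the corner patch with the target label; how strongly it does so depends on the model and the poisoning rate.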

Clements et al. “Hardware Trojan Attacks on Neural Networks”

Clements et al. “Hardware Trojan Attacks on Neural Networks”
➢ The trigger changes which function is applied.

Zou et al. “PoTrojan: powerful neuron-level trojan designs in deep learning models”
➢ T: fires only when a specific trigger pattern is input.
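The idea of a neuron that stays silent unless one specific pattern appears can be sketched as follows. This is a minimal illustration under assumptions, not the design from the paper: the exact-match rule, the tolerance `tol`, the `boost` added to the target logit, and all function names are hypothetical.

```python
def trigger_neuron(activations, pattern, tol=1e-6):
    """Return 1.0 only if every activation matches the trigger pattern
    (within `tol`); otherwise return 0.0, so the neuron stays dormant."""
    if len(activations) != len(pattern):
        return 0.0
    match = all(abs(a - p) <= tol for a, p in zip(activations, pattern))
    return 1.0 if match else 0.0

def forward_with_trojan(logits, activations, pattern, target_class, boost=100.0):
    """Add a large payload to the target logit only when the trigger fires."""
    t = trigger_neuron(activations, pattern)
    out = list(logits)
    out[target_class] += boost * t
    return out

def argmax(xs):
    return max(range(len(xs)), key=lambda i: xs[i])

pattern = [0.3, 0.9, 0.1]          # hypothetical trigger pattern
clean_logits = [2.0, 1.0, 0.5]     # the host model prefers class 0

benign = forward_with_trojan(clean_logits, [0.3, 0.9, 0.2], pattern, target_class=2)
attacked = forward_with_trojan(clean_logits, [0.3, 0.9, 0.1], pattern, target_class=2)
```

Because the neuron outputs exactly zero on anything but the trigger, the host model's behavior on ordinary inputs is unchanged, which is what makes this kind of Trojan hard to notice through normal testing.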

Li et al. “Hu-Fu: Hardware and Software Collaborative Attack Framework against Neural Networks”
➢ Wact + Winact: operates normally
➢ Wact only: harmful results
➢ Winact: stopped by the trigger (its output becomes 0)
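The weight split described above can be sketched with a single toy layer: the normal output is the sum of the two parts, and the trigger zeroes the Winact contribution so that only Wact remains. The concrete matrices, the `triggered` flag, and the function names are assumptions chosen for illustration; the slides do not give the actual construction.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def hufu_layer(x, W_act, W_inact, triggered):
    """Output of W_act plus W_inact; the W_inact part is forced to 0
    when the (hypothetical) hardware trigger is asserted."""
    active = matvec(W_act, x)
    inactive = [0.0] * len(active) if triggered else matvec(W_inact, x)
    return [a + b for a, b in zip(active, inactive)]

def argmax(xs):
    return max(range(len(xs)), key=lambda i: xs[i])

# W_act + W_inact equals the identity, i.e. the intended benign weights.
W_act = [[0.0, 0.0],
         [1.0, 1.0]]
W_inact = [[1.0, 0.0],
           [-1.0, 0.0]]

x = [1.0, 0.2]
normal = hufu_layer(x, W_act, W_inact, triggered=False)    # close to [1.0, 0.2]
attacked = hufu_layer(x, W_act, W_inact, triggered=True)   # W_act only
```

In this toy setup the untriggered layer picks class 0, while the triggered layer, running on Wact alone, picks class 1.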

Others
• Dai et al. “A backdoor attack against LSTM-based text classification systems”
✓ Plants a Neural Trojan in an LSTM
• Kiourti et al. “TrojDRL: Trojan Attacks on Deep Reinforcement Learning Agents”
✓ Plants a Neural Trojan in a reinforcement learning model

Defense

Liu et al. “Neural Trojans” (shown again)
Figure labels: permitted / Not permitted; data the customer expects; training data used by the attacker

Liu et al. “Neural Trojans”
1. Input Anomaly Detection
➢ SVM, Decision Tree
➢ 99.8% trigger detection, with 12.2% false positive
2. Re-Training
➢ 94.1% trigger detection
➢ IP should be reconfigurable
3. Input Processing
➢ 90.2% trigger detection

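Liu et al. “Neural Trojans” 3. Input Processing
Figure labels: Auto Encoder; DNN (Trojan?)
➢ The customer trains an autoencoder on data they hold themselves.
➢ Images of the kind it was trained on keep their shape, but images it was not trained on (triggers) come out as something entirely different → the Trojan does not fire.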
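The effect of this preprocessing step can be illustrated without training a real autoencoder. In the sketch below a one-dimensional linear projection stands in for the autoencoder (a deliberate simplification, not the method from the paper): it is fitted to the customer's data, reconstructs similar inputs almost perfectly, and maps an out-of-distribution input to something quite different. All names and the toy vectors are assumptions.

```python
import math

def fit_direction(samples):
    """'Train' on the customer's data: use the normalized mean vector as
    the single direction the stand-in autoencoder can represent."""
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
    norm = math.sqrt(sum(m * m for m in mean))
    return [m / norm for m in mean]

def reconstruct(x, direction):
    """Encode x to one number (its projection), then decode it back."""
    code = sum(xi * di for xi, di in zip(x, direction))
    return [code * di for di in direction]

def distance(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Customer data: flattened toy "images" that only use the first two pixels.
customer_data = [[1.0, 1.0, 0.0, 0.0],
                 [2.0, 2.0, 0.0, 0.0],
                 [0.5, 0.5, 0.0, 0.0]]
direction = fit_direction(customer_data)

legit_input = [1.5, 1.5, 0.0, 0.0]      # resembles the customer's data
trigger_input = [0.0, 0.0, 1.0, 1.0]    # hypothetical out-of-distribution trigger

legit_error = distance(legit_input, reconstruct(legit_input, direction))
trigger_error = distance(trigger_input, reconstruct(trigger_input, direction))
```

Here `legit_error` is close to zero while `trigger_error` is large, so the downstream DNN would receive the legitimate input nearly unchanged and the trigger input in a heavily altered form.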

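Chou et al. “SentiNet: Detecting Physical Attacks Against Deep Learning Systems”
➢ Use Grad-CAM (a method for visualizing the basis of a decision) to examine where the DNN is looking.
➢ Identify the part that strongly influences the result. When that part is pasted onto other images, can it change their results?
➢ If it can → Trigger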

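Chou et al. “SentiNet: Detecting Physical Attacks Against Deep Learning Systems”
Figure labels: class change success rate; confidence on Control; Trigger; Safe
➢ Control: an image in which the location of the part is hidden
➢ Low confidence on Control → the cause is that an important region was hidden, rather than the influence of a trigger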
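The paste-and-compare test described on these two slides can be sketched as below. The sketch assumes the salient region has already been located (the Grad-CAM step is skipped), uses a toy hand-written classifier in place of a DNN, and measures two numbers: how often pasting the suspect part forces the suspect's class onto other images, and the average confidence when the same location is covered with an inert value. The function names, the inert value 0.5, and the toy classifier are all assumptions.

```python
def paste(image, region, patch):
    """Return a copy of a flat image with `patch` written at the `region` indices."""
    out = list(image)
    for idx, value in zip(region, patch):
        out[idx] = value
    return out

def sentinet_check(classify, suspect_image, region, test_images, inert_value=0.5):
    """Return (class change success rate, mean confidence on Control)."""
    suspect_class, _ = classify(suspect_image)
    patch = [suspect_image[i] for i in region]
    inert = [inert_value] * len(region)
    # Only images not already predicted as the suspect class are informative.
    pool = [img for img in test_images if classify(img)[0] != suspect_class]
    if not pool:
        raise ValueError("no test image differs from the suspect class")
    fooled = sum(1 for img in pool
                 if classify(paste(img, region, patch))[0] == suspect_class)
    control = sum(classify(paste(img, region, inert))[1] for img in pool)
    return fooled / len(pool), control / len(pool)

def toy_classifier(image):
    """Hypothetical backdoored model: pixel 4 equal to 9.0 forces class 2;
    otherwise the class depends on the brightness of pixels 0 and 1."""
    if image[4] == 9.0:
        return 2, 1.0
    m = (image[0] + image[1]) / 2
    return (0 if m >= 0.5 else 1), abs(m - 0.5) * 2

test_images = [[1.0, 1.0, 0.2, 0.2, 0.0], [0.0, 0.0, 0.2, 0.2, 0.0],
               [0.9, 1.0, 0.1, 0.3, 0.0], [0.1, 0.0, 0.3, 0.1, 0.0]]

# Suspect A carries the trigger in pixel 4; suspect B is a benign bright object.
trigger_result = sentinet_check(toy_classifier, [0.0, 0.0, 0.2, 0.2, 9.0], [4], test_images)
benign_result = sentinet_check(toy_classifier, [1.0, 1.0, 0.2, 0.2, 0.0], [0, 1], test_images)
```

In this toy both suspects reach a class change success rate of 1.0, but only the trigger keeps the Control confidence high; for the benign part the Control confidence drops to zero, which matches the reading that an important region was simply covered.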

Conclusion
Neural Trojans are scary.

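Advertisement
✓ A geospatial information hackathon hosted by the Ministry of Internal Affairs and Communications (総務省)
✓ Learn how to make use of geospatial information and develop a service in two days
✓ Register via connpass!
➢ Aichi venue: 2020/02/01 (Sat) to 2020/02/02 (Sun)
✓ Solving mobility challenges
➢ Toyama venue: 2020/02/08 (Sat) to 2020/02/09 (Sun)
✓ Game development using geospatial information (Unity)
➢ Tokyo venue: 2020/02/15 (Sat) to 2020/02/16 (Sun)
✓ Solving disaster-prevention challenges
➢ Okinawa venue: 2020/02/22 (Sat) to 2020/02/23 (Sun)
✓ Solving challenges in mobility, ResorTech, and related areas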
