DaST


I explained and re-implemented this paper for a class assignment.


Kaede Shiohara

August 05, 2020

Transcript

  1. DaST: Data-free Substitute Training for Adversarial Attack [CVPR 2020]
    Paper authors: Mingyi Zhou, Jing Wu, Yipeng Liu, Shuaicheng Liu, Ce Zhu (University of Electronic Science and Technology of China; Megvii Technology)
    Presenter: Kaede Shiohara (M1)
  2. Contents
    • Explanation part
      • Main contribution
      • Traditional Adversarial Attack methods
      • Idea
      • Attack Scenario
      • Adversarial Generator-Classifier Training
      • Experiments
      • Visualizations
    • Re-implementation part
      • Model Architecture
      • Experiment on MNIST
  3. Explanation part

  4. Main contribution (why this paper was accepted)
    • The first method to train a substitute model without real training data, in two attack scenarios.
  5. Traditional Adversarial Attack methods
    • Gradient-based (e.g. FGSM [1])
      ✓ Need a pretrained model that imitates the target model -> need real training data (which is very difficult to obtain in real problems!)
    • Score-based, Decision-based (e.g. ZOO [2])
      ✓ Need many queries at test time
      ✓ Do not need a substitute model
    [1] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. International Conference on Learning Representations (ICLR), 2015.
    [2] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15-26. ACM, 2017.
  6. Traditional Adversarial Attack methods
    • Gradient-based (e.g. FGSM [1])
      ✓ Need a pretrained model that imitates the target model -> need real training data (which is very difficult to obtain in real problems!)
    • Score-based, Decision-based (e.g. ZOO [2])
      ✓ Need many queries at test time
      ✓ Do not need a substitute model
    • DaST (proposed method): not an attack method itself
      • Trains a substitute model without real training data -> useful when a substitute model is needed, as in gradient-based attack methods
      => Gives a solution
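
A minimal sketch of the gradient-based attack named on this slide (FGSM [1]), assuming a toy logistic-regression substitute so that the input gradient has a closed form. The model, the weights `w`, the bias `b`, and the step size `eps` are illustrative assumptions, not the networks used in the paper or in the re-implementation.

```python
import math


def predict(x, w, b):
    """Probability of class 1 under a logistic-regression substitute (toy assumption)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(x, y, w, b, eps):
    """One non-targeted FGSM step: x_adv = clip(x + eps * sign(dLoss/dx)).

    For binary cross-entropy on a logistic model, dLoss/dx = (p - y) * w.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    # Keep the perturbed input inside the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]


# Hypothetical substitute parameters and a sample whose true label is 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.6, 0.2], 1
x_adv = fgsm(x, y, w, b, eps=0.4)
print(predict(x, w, b), predict(x_adv, w, b))
```

The attack only needs gradients of the substitute, which is why the quality of the substitute decides whether the adversarial examples transfer to the target.
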
  7. [Diagrams: an online target model T and a local substitute model D, compared through a loss function; the DaST diagram adds a generator G]
    ・Naïve Gradient-based Attack: the substitute is trained by backpropagation. It is difficult for the attacker to collect the same training dataset that the target model used.
    ・Gradient-based Attack with DaST: the substitute is trained by backpropagation. There is no need to collect a training dataset because DaST generates it.
  8. Idea: use an image generator (G) to train the substitute model (D)
    • Objective of D
      • Imitate the attacked model (T)
    • Objective of G
      • Generate new samples with the given label n [formula lost in transcript]
      • Generate new samples that maximize the distance between D and T [formula lost in transcript]
    Slide notes: CE = cross entropy; one form of the term is more stable in training than another [symbols lost in transcript]; "to increase diversity of generated samples"
  9. Idea: use an image generator (G) to train the substitute model (D)
    • Objective of D
      • Imitate the attacked model (T)
    • Objective of G
      • Generate new samples with the given label n [formula lost in transcript]
      • Generate new samples that maximize the distance between D and T [formula lost in transcript]
    Slide notes: CE = cross entropy; T's gradients cannot be accessed, but as training progresses D ≈ T; "to increase diversity of generated samples"
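
The objectives above can be sketched numerically as follows. This is a hedged reconstruction, not the paper's exact definition: the first term of L_G, e^{-d(T(X), D(X))}, is taken from slide 22 and α = 0.2 from slide 28, while the L2 form of the distance `d`, the choice L_D = d, and evaluating the label term L_C on D's output (since T's gradients are unavailable) are assumptions made for illustration.

```python
import math


def cross_entropy(probs, label):
    """CE between a probability vector and an integer class label."""
    return -math.log(max(probs[label], 1e-12))


def distance(t_out, d_out):
    """Assumed distance d(T(X), D(X)): L2 norm between the two output vectors."""
    return math.sqrt(sum((t - d) ** 2 for t, d in zip(t_out, d_out)))


def dast_losses(t_out, d_out, label, alpha=0.2):
    """Return (L_D, L_C, L_G) for one generated sample X with requested label n.

    L_D = d(T(X), D(X))          -> D minimizes it to imitate T
    L_C = CE(D(X), n)            -> G is pushed to produce samples of label n
    L_G = exp(-d) + alpha * L_C  -> minimizing exp(-d) maximizes the D/T distance
    """
    d = distance(t_out, d_out)
    loss_c = cross_entropy(d_out, label)
    return d, loss_c, math.exp(-d) + alpha * loss_c


# D agrees with T: the distance term of L_G is at its maximum value of 1.
print(dast_losses([0.1, 0.9], [0.1, 0.9], label=1))
# D disagrees with T: exp(-d) shrinks, which is what G is rewarded for.
print(dast_losses([0.9, 0.1], [0.1, 0.9], label=1))
```
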
  10. Attack scenario
    • Label-only
      • Attackers can probe only the output hard label of the attacked model (e.g. [0, 0, …, 0, 1, 0, …, 0])
    • Probability-only
      • Attackers can probe the output probabilities of the attacked model (e.g. [0.03, 0.1, …, 0.05, 0.7, 0.01, …, 0.04])
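
The difference between the two scenarios can be illustrated with a small helper that turns the attacked model's internal probabilities into what the attacker is allowed to see. The function name `query_target` and the scenario strings are hypothetical, chosen only for this sketch.

```python
def query_target(probs, scenario):
    """Return the attacker-visible output of the attacked model for one input.

    probs: the model's class probabilities (never seen directly in label-only).
    scenario: "probability-only" or "label-only" (names assumed for this sketch).
    """
    if scenario == "probability-only":
        return list(probs)
    if scenario == "label-only":
        # Only the arg-max class is revealed, as a one-hot vector.
        top = max(range(len(probs)), key=lambda i: probs[i])
        return [1.0 if i == top else 0.0 for i in range(len(probs))]
    raise ValueError(f"unknown scenario: {scenario}")


probs = [0.1, 0.7, 0.2]
print(query_target(probs, "probability-only"))  # full probability vector
print(query_target(probs, "label-only"))        # one-hot hard label
```
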
  11. Adversarial Generator-Classifier Training (N: number of classes)

  12. Adversarial Generator-Classifier Training (N: number of classes)
    • L_i generates samples with label i
  13. Adversarial Generator-Classifier Training (N: number of classes)
    • The conv layers are shared by all L_i
  14. Adversarial Generator-Classifier Training [formula lost in transcript]

  15. Experiment on MNIST
    [Results table: attack type, attack method, and substitute model type; values are attack success rates (%)]
    Substitute model types:
    • Pretrained: trained with the same dataset that the attacked model used
    • DaST-P: probability-only scenario
    • DaST-L: label-only scenario
  16. Experiment on MNIST
    • Attacked: 4-conv network
    • Substitute: 5-conv network
    • Rankings marked on the results table: DaST-P > Pretrained (> DaST-L); (DaST-P >) DaST-L > Pretrained
    Surprisingly, the attack success rate of DaST is higher than that of Pretrained.
  17. Experiment on MNIST
    • Attacked: 4-conv network
    • Substitute S/M/L: 3/4/5-conv network
    • Large > Small ≧ Medium
  18. Experiment on CIFAR-10
    • Attacked: VGG16
    • Substitute: ResNet50
    • Rankings marked on the results table: DaST-P > Pretrained (> DaST-L); (DaST-P >) DaST-L > Pretrained
  19. Experiment on CIFAR-10
    • Attacked: VGG16
    • Substitute: VGG13/ResNet18/ResNet50
    • VGG13 > ResNet50 > ResNet18
    Unlike on MNIST, a smaller model works better.
  20. Visualization

  21. Experiment on Microsoft Azure (online model)
    • Attacked: unknown
    • Substitute: 5-conv network
    • DaST-L > DaST-P > Pretrained
    The low attack success rate of 'Pretrained' implies that the unknown model is very different from the substitute model.
  22. Visualization
    DaST generates 'singular' images because of the first term, e^{-d(T(X), D(X))}, of L_G.
  23. Re-implementation part

  24. Model Architecture

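  25. Model Architecture
    ・The paper does not state the number of layers, the parameters, or similar details.

  26. Model Architecture
    ・The paper does not state the number of layers, the parameters, or similar details.

  27. Model Architecture
    ・The paper states only "3(,4,5) convolutional layers".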

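  28. Training
    The reproduction experiment used the following settings:
    ・α = 0.2
    ・Dataset: MNIST
    ・Scenario: non-targeted, probability-only / label-only
    ・Optimizer: Adam (lr=0.0001) (not stated in the paper)
    ・Number of samples: not stated in the paper, so training was repeated for a sufficient number of iterations
    ・Attack method: FGSM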
  29. Result (Prob-only)
    In the reproduction experiment, L_G and L_D decreased, but the accuracy on MNIST did not rise as high as in the paper.
    -> As a result, effective adversarial examples against the target model could not be generated (the ASR did not rise as high as in the paper).
    [Plots over epochs 1 to about 500: Acc_mnist, Acc_synth, and ASR (axis 0 to 1); L_C (axis 0 to 1); L_D (axis 0 to 0.005)]
  30. Result (Prob-only)
    [Plots over epochs 1 to about 500: Acc_mnist, Acc_synth, and ASR (axis 0 to 1); L_C (axis 0 to 1); L_D (axis 0 to 0.005)]
    • Acc_synth: accuracy of the substitute model on generated images
    • Acc_mnist: accuracy of the substitute model on the MNIST test set (10,000 samples)
    • ASR: attack success rate obtained with the substitute model
    Annotations on the plots: training was unstable; the accuracy plateaued; the loss plateaued quickly.
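
A minimal sketch of how a non-targeted attack success rate (ASR) like the one plotted here could be computed. Counting only the samples that the target model classified correctly before the attack is an assumption of this sketch, and the function name is hypothetical; the slides do not spell out the exact definition.

```python
def attack_success_rate(true_labels, clean_preds, adv_preds):
    """Fraction of eligible samples whose adversarial version is misclassified.

    true_labels: ground-truth classes.
    clean_preds: target-model predictions on the clean inputs.
    adv_preds:   target-model predictions on the adversarial inputs
                 (crafted with the substitute model, e.g. by FGSM).
    """
    # Assumption: only samples the target got right before the attack count.
    eligible = [(y, a) for y, c, a in zip(true_labels, clean_preds, adv_preds) if c == y]
    if not eligible:
        return 0.0
    fooled = sum(1 for y, a in eligible if a != y)
    return fooled / len(eligible)


# Four samples; the last was already misclassified, so three are eligible.
print(attack_success_rate([0, 1, 2, 3], [0, 1, 2, 0], [1, 1, 0, 0]))
```
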
  31. Result (Label-only)
    In the label-only case, as in the probability-only case, training stalled with the substitute model's accuracy on the MNIST test set at about 0.4.
    [Plots over epochs 1 to about 500: L_D (axis 0 to 2); L_C (axis 0 to 1); Acc_mnist, Acc_synth, and ASR (axis 0 to 1)]
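  32. Result
    The following are possible reasons why the values reported in the paper could not be reproduced:
    • Difficulty of training
      • As in ordinary GAN training, this method is a minimax game between G and D (see slides 8 and 9). Indeed, training was unstable, as shown on the previous slides.
    • The real data (MNIST) is invisible to the substitute model, so its accuracy is not guaranteed
      • In the experiments on the previous slides, the proposed L_D and L_G decreased properly, yet the accuracy on MNIST did not improve.
    Therefore, the experimental results may not be reproducible from the original paper alone, because it omits details of the model hyperparameters and the training procedure.
    (Note: the parameters reported for the reproduction experiment are the best among several experiments that were run.)
    Code URL: https://github.com/mapooon/DaST_reimplement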