Slide 1

No content

Slide 2

About me
● Security + data science
● Master's student at the University of Tartu, Estonia

Slides 3–6

● What's adversarial ML
● Goals of adversarial examples
● Algorithms to craft adversarial examples
● Defense against adversarial examples (bonus)

Slide 7

What's machine learning?

Slide 8

What's adversarial ML?

Slide 9

Adversarial ML = Security + ML

Slide 10

History Lesson!!

Slides 11–12

Where it started:
● PRALab (University of Cagliari)
Now:
● Everyone
First paper: "Intriguing properties of neural networks", https://arxiv.org/pdf/1312.6199.pdf

Slide 13

Source: https://pralab.diee.unica.it/en/wild-patterns

Slide 14

Types of attacks
● White-box: the attacker knows the model (architecture, parameters, training data)
● Black-box: the attacker can only query the model and observe its outputs

Slides 15–17

Ways to attack
● Poisoning the training data (train-time attack); see the label-flipping sketch below
● Crafting adversarial examples (test-time attack)
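A minimal sketch of the poisoning idea via label flipping; the toy dataset, 20% flip rate, and logistic-regression model are illustrative assumptions, not from the talk:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy binary-classification data (illustrative assumption, not the talk's dataset).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Train-time attack: flip the labels of 20% of the training points.
rng = np.random.default_rng(0)
flip = rng.choice(len(y), size=int(0.2 * len(y)), replace=False)
y_poisoned = y.copy()
y_poisoned[flip] = 1 - y_poisoned[flip]

clean_model = LogisticRegression().fit(X, y)
poisoned_model = LogisticRegression().fit(X, y_poisoned)

# The model trained on poisoned labels degrades on clean data.
print("trained clean   :", clean_model.score(X, y))
print("trained poisoned:", poisoned_model.score(X, y))
```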

Slides 18–21

Goals of adversarial examples (formalized below)
● Confidence reduction: reduce the confidence of the output classification
● Misclassification: change the output to any incorrect class
● Targeted misclassification: craft inputs that the model classifies as a chosen class
● Source/target misclassification: force a specific input to be classified as a specific class
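In optimization terms (a standard formulation, not spelled out on the slides), the untargeted and targeted goals differ only in the objective optimized over the perturbation δ:

```latex
% Untargeted misclassification: increase the loss on the true label.
\[
\max_{\|\delta\|_\infty \le \epsilon} \; L\big(f(x+\delta),\, y_{\text{true}}\big)
\]
% Targeted misclassification: decrease the loss on the chosen target label.
\[
\min_{\|\delta\|_\infty \le \epsilon} \; L\big(f(x+\delta),\, y_{\text{target}}\big)
\]
```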

Slide 22

How does attacking ML models work?

Slide 23

How adversarial examples work
Source: cleverhans.io

Slide 24

"Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images": https://arxiv.org/pdf/1412.1897.pdf

Slide 25

Source: "Adversarial Examples for Evaluating Reading Comprehension Systems", https://arxiv.org/pdf/1707.07328.pdf

Slides 26–32

Algorithms to craft adversarial examples for NNs
1. FGSM: Fast Gradient Sign Method
2. JSMA: Jacobian-based Saliency Map Attack
3. Carlini-Wagner (C&W) attack
4. DeepFool
5. BIM: the Basic Iterative Method
6. EAD: Elastic-Net Attack
7. PGD: Projected Gradient Descent attack
PS: these are only the famous ones; a minimal FGSM sketch follows this list.
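A minimal FGSM sketch in PyTorch, assuming a classifier trained with cross-entropy loss and inputs scaled to [0, 1]; the stand-in model, shapes, and eps value are illustrative assumptions, not from the talk:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    """One-step FGSM: x_adv = x + eps * sign(grad_x loss(f(x), y))."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then keep pixels valid.
    x_adv = x_adv + eps * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Illustrative usage with a stand-in model and fake data (assumptions):
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
x = torch.rand(8, 1, 28, 28)    # batch of fake "images" in [0, 1]
y = torch.randint(0, 10, (8,))  # fake labels
x_adv = fgsm(model, x, y, eps=0.1)
print((x_adv - x).abs().max())  # perturbation bounded by eps
```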

Slide 33

Black-box attacks: how can they even be possible? :(

Slide 34

Transferability: adversarial examples crafted against one model often fool other models trained for the same task, which is what makes black-box attacks possible.
Source: "The Space of Transferable Adversarial Examples", https://arxiv.org/pdf/1704.03453.pdf

Slides 35–38

Why should I care?

Slide 39

No content

Slide 40

But they aren't that easy to make... are they? :(

Slide 41

No content

Slide 42

Then how do we defend models against adversarial examples?
● Adversarial training (see the sketch below)
○ Ensemble adversarial training
● Defensive distillation
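A minimal sketch of one adversarial training step, reusing the fgsm helper from the FGSM example above; the 50/50 clean/adversarial mix and eps value are illustrative assumptions, not the talk's prescription:

```python
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=0.1):
    """Train on a 50/50 mix of clean and FGSM-perturbed inputs."""
    x_adv = fgsm(model, x, y, eps)  # fgsm as defined in the sketch above
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) \
         + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```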

Slide 43

No content

Slide 44

Libraries and resources
● CleverHans (TensorFlow)
● Foolbox (Bethge Lab)
● Secure ML library (not released)
● Tools from PRALab
● My blog

Slide 45

Thank you! Q&A time
Twitter: @prabhantsingh
LinkedIn: https://www.linkedin.com/in/prabhantsingh
GitHub: @prabhant