Attacks on Machine Learning

About me
• Security + Data science
• Master’s student at University of Tartu, Estonia

• What’s adversarial ML
• Goals of adversarial examples
• Algorithms to craft adversarial examples
• Defense against adversarial examples (BONUS)

What’s Machine learning?

What’s Adversarial ML?
= Security + ML

History Lesson!!

Where it started:
• PRALab Unica
Now:
• Everyone
First paper: https://arxiv.org/pdf/1312.6199.pdf

Source: https://pralab.diee.unica.it/en/wild-patterns

Types of attacks
• White-box
• Black-box

Ways to attack
• Poisoning training data (train-time attack)
• Crafting adversarial examples (test-time attack)
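To make the train-time case concrete, here is a minimal sketch of data poisoning. Everything in it is an assumption for illustration: the 1-D nearest-centroid classifier, the data points, and the helper names `fit` and `predict` are hypothetical and do not come from the talk. The attacker never touches the input at test time; a few mislabeled training points are enough to move a class centroid and flip the prediction.

```python
def centroid(points):
    """Mean of a list of 1-D feature values."""
    return sum(points) / len(points)

def fit(samples):
    """Fit a nearest-centroid classifier: one centroid per label."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: centroid(values) for label, values in by_label.items()}

def predict(model, value):
    """Return the label whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(model[label] - value))

# Hypothetical clean training set: (feature, label) pairs.
clean = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
# Points the attacker slips into the training set, mislabeled as class 0.
poison = [(14.0, 0), (14.0, 0), (14.0, 0)]

victim_input = 6.0
before = predict(fit(clean), victim_input)          # centroids at 1.0 and 9.0
after = predict(fit(clean + poison), victim_input)  # class-0 centroid moves to 7.5
print(before, after)
```

The same input goes from class 1 to class 0 purely because the training data changed, which is what separates poisoning from the test-time attacks covered in the rest of the deck.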

Adversarial example goals
• Confidence reduction: reduce the confidence of the output classification
• Misclassification: change the output class
• Targeted misclassification: produce inputs that are classified as a specific class
• Source/target misclassification: make a specific input produce a specific output

How does Attacking ML models work?

How adversarial examples work
Source: cleverhans.io

Deep Neural Networks are Easily Fooled: High Confidence Predictions for
Unrecognizable Images: https://arxiv.org/pdf/1412.1897.pdf

Source: Adversarial Examples for Evaluating Reading Comprehension Systems: https://arxiv.org/pdf/1707.07328.pdf

Algorithms to craft adversarial examples for neural networks
1. FGSM: Fast Gradient Sign Method
2. JSMA: Jacobian-based Saliency Map Attack
3. Carlini-Wagner attack
4. DeepFool
5. The Basic Iterative Method
6. EAD: Elastic-Net Attacks
7. Projected Gradient Descent attack
PS: these are only the famous ones
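As a rough illustration of the first and fifth entries, the sketch below runs FGSM and the Basic Iterative Method against a tiny logistic-regression model instead of a neural network, so the input gradient can be written by hand. The weights, the input, `eps`, and `alpha` are made-up values chosen for illustration. FGSM takes one step of size `eps` along the sign of the loss gradient; the Basic Iterative Method repeats smaller steps and clips the result back into the `eps`-ball around the original input. Clipping to a valid input range (for example pixel values) is left out here.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: x_adv = x + eps * sign(d loss / d x)."""
    p = predict_proba(w, b, x)
    # For cross-entropy loss on this model, d loss / d x_i = (p - y) * w_i.
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def basic_iterative(w, b, x, y, eps, alpha, steps):
    """Repeat small FGSM steps, clipping back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        stepped = fgsm(w, b, x_adv, y, alpha)
        x_adv = [min(max(v, xi - eps), xi + eps) for v, xi in zip(stepped, x)]
    return x_adv

# Hypothetical toy model and input, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.4, 0.1, 0.2], 1

x_fgsm = fgsm(w, b, x, y, eps=0.3)
x_bim = basic_iterative(w, b, x, y, eps=0.3, alpha=0.1, steps=5)

p_clean = predict_proba(w, b, x)
p_fgsm = predict_proba(w, b, x_fgsm)
p_bim = predict_proba(w, b, x_bim)
print(p_clean, p_fgsm, p_bim)
```

On this toy model the clean input is classified as class 1 and both perturbed inputs fall on the other side of the decision boundary, while no feature moves by more than `eps`. Projected Gradient Descent follows the same loop as `basic_iterative` but starts from a random point inside the `eps`-ball.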

Black-box: how can it even be possible? :(

Transferability
The Space of Transferable Adversarial Examples: https://arxiv.org/pdf/1704.03453.pdf
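A minimal sketch of why transferability makes black-box attacks possible, under heavy assumptions: both models are toy logistic regressions with made-up weights, and the attacker is assumed to have trained the substitute on similar data. The adversarial example is crafted with FGSM using only the substitute's gradients, then sent to the target model, whose weights the attacker never reads.

```python
import math

def predict_proba(w, x):
    """Probability of class 1 under a bias-free logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, x, y, eps):
    """One FGSM step using the gradients of the model with weights w."""
    p = predict_proba(w, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

# Hypothetical weights. The attacker only knows the substitute.
target_w = [2.0, -1.0, 0.5]
substitute_w = [1.5, -0.8, 0.4]

x, y = [0.4, 0.1, 0.2], 1
x_adv = fgsm(substitute_w, x, y, eps=0.3)  # gradients come from the substitute only

target_clean = predict_proba(target_w, x)
target_adv = predict_proba(target_w, x_adv)
print(target_clean, target_adv)
```

Because the two models learned similar decision boundaries, the perturbation that fools the substitute also pushes the target's prediction below 0.5. How often this happens for real models is the question the linked paper studies.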

Why should I Care?

But they aren’t that easy to make... are they? :(

Then how to defend the models against adversarial examples?
• Adversarial training
  ◦ Ensemble adversarial training
• Defensive distillation
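The adversarial training loop can be sketched as follows. This is a toy under stated assumptions, not the procedure from any particular paper or library: the model is a bias-free logistic regression, the data set and hyperparameters are invented, and the adversarial examples are crafted with FGSM against the current weights at every step. Each update uses both the clean example and its adversarial counterpart with the correct label.

```python
import math

def predict_proba(w, x):
    """Probability of class 1 under a bias-free logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, x, y, eps):
    """One FGSM step against the model with weights w."""
    p = predict_proba(w, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def sgd_step(w, x, y, lr):
    """One gradient-descent step on the cross-entropy loss for (x, y)."""
    p = predict_proba(w, x)
    return [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]

def adversarial_train(X, Y, eps, lr=0.1, epochs=50):
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            x_adv = fgsm(w, x, y, eps)     # crafted against the current weights
            w = sgd_step(w, x, y, lr)      # learn from the clean example
            w = sgd_step(w, x_adv, y, lr)  # and from its adversarial version
    return w

def accuracy(w, X, Y):
    hits = sum((predict_proba(w, x) > 0.5) == bool(y) for x, y in zip(X, Y))
    return hits / len(X)

# Hypothetical, linearly separable toy data.
X = [[2.0, 2.0], [2.0, 1.0], [1.0, 2.0], [-2.0, -2.0], [-2.0, -1.0], [-1.0, -2.0]]
Y = [1, 1, 1, 0, 0, 0]

eps = 0.5
w = adversarial_train(X, Y, eps)
X_attacked = [fgsm(w, x, y, eps) for x, y in zip(X, Y)]
clean_acc = accuracy(w, X, Y)
robust_acc = accuracy(w, X_attacked, Y)
print(clean_acc, robust_acc)
```

The data here is too easy to show a robustness gain over plain training; the point is only the shape of the loop. On real models the adversarial examples are regenerated throughout training, which makes this defense noticeably more expensive than standard training.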

Libraries and resources
• CleverHans (TensorFlow)
• Foolbox (bethgelab)
• Secure ML Library (not released)
• Tools from PRA Lab
• My blog

Thank You
Q/A time <Don’t ask “WHY” because nobody knows>
Twitter: @prabhantsingh
LinkedIn: https://www.linkedin.com/in/prabhantsingh
GitHub: @prabhant
