Dévoiler les secrets d’un modèle de machine learning : une menace crédible ?

Uncovering the secrets of a machine learning model: a credible
threat?

Giulia BIANCHI, Data Scientist, @PubSapientEng, @Giuliabianchl
Johan JUBLANC, Data Scientist, @PubSapientEng, JJublanc

OUTLINE
• Model attacks
• Differential privacy
• A model inversion attack: the secret revealer
• Are attacks credible and avoidable?

Model attacks

What do you mean by “attacking a model”?

A lot of ways to attack a model
1. Evasion attacks
2. Poisoning attacks
3. Privacy attacks
[Figure labels: Classifier, Toaster]

Privacy attacks
• Attacks on training data
  • Membership inference attacks
  • Inversion attacks
• Model extraction

White box vs black box accessibility
• Black box: only the inputs and outputs of the model (Securmax) are accessible
• White box: the inputs and outputs are accessible, together with:
  • model type
  • model architecture
  • parameters' values

HELLO DATAMONS!
Streamèche, Realticèle, DS-Li, FastFeu, Prototys, SciProduce, Securti, Securalto, SciProdaffe, Buildy, ML-Li, AI-Li, Productor, SciProdank, Securmax

Differential privacy

When DP is useful: protection against unintended memorisation
The Secret Sharer: Evaluating and Testing Unintended Memorisation in Neural Networks
• Black-box attack on a sequence-generating neural network
• The authors introduced a social security number into the training data (the secret) and were able to retrieve it at prediction time
• Differential privacy solved the problem
https://xkcd.com/2169/

What DP does: training a model
• input = n entries → Model → output
• input = n-1 entries → Model → output

What DP does: training a Differentially Private (DP) model
• input = n entries → DP Model → output
• input = n-1 entries → DP Model → output

What DP is: the (mathematical) theory
• Differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population
• Roughly, an algorithm is differentially private if an observer seeing its output cannot tell if a particular individual's information was used in the computation
• Differential privacy will provide privacy by process; in particular it will introduce randomness
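The idea of "privacy by process" through randomness can be illustrated with a minimal sketch. It assumes a simple counting query protected by the Laplace mechanism, a textbook construction that the slides do not use; `noisy_count` and the toy data are hypothetical.

```python
import numpy as np


def noisy_count(entries, epsilon, rng):
    """Return a differentially private count of the positive entries.

    Adding or removing one entry changes the true count by at most 1
    (sensitivity 1), so Laplace noise of scale 1/epsilon is sufficient.
    """
    true_count = int(np.sum(entries))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)


rng = np.random.default_rng(0)
with_individual = np.array([1, 0, 1, 1, 0, 1])  # n entries
without_individual = with_individual[:-1]       # n-1 entries

# For a small epsilon the two noisy answers are hard to tell apart.
print(noisy_count(with_individual, epsilon=0.5, rng=rng))
print(noisy_count(without_individual, epsilon=0.5, rng=rng))
```

An observer who only sees one noisy answer cannot reliably say whether the last individual was part of the input.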

From theory to practice: differentially private stochastic gradient descent (SGD)
Differential privacy is introduced in deep learning algorithms by adding Gaussian noise to stochastic gradient descent.
[Figure: DP Model]
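A minimal NumPy sketch of one differentially private SGD update may help here. It assumes the usual recipe from "Deep Learning with Differential Privacy" (clip each per-example gradient, add Gaussian noise, average); the function `dp_sgd_step` and the toy gradients are hypothetical, and the parameter names mirror those of the TensorFlow Privacy optimizer.

```python
import numpy as np


def dp_sgd_step(weights, per_example_grads, learning_rate,
                l2_norm_clip, noise_multiplier, rng):
    """One DP-SGD update: clip each example's gradient, add noise, average."""
    # 1. Clip every per-example gradient to an L2 norm of at most l2_norm_clip.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, l2_norm_clip / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # 2. Sum the clipped gradients and add Gaussian noise whose standard
    #    deviation is proportional to the clipping norm.
    noise = rng.normal(0.0, noise_multiplier * l2_norm_clip, size=weights.shape)
    noisy_sum = clipped.sum(axis=0) + noise

    # 3. Average over the batch and take an ordinary gradient step.
    return weights - learning_rate * noisy_sum / len(per_example_grads)


rng = np.random.default_rng(0)
weights = np.zeros(3)
per_example_grads = 10.0 * rng.normal(size=(8, 3))  # toy data, one row per example
new_weights = dp_sgd_step(weights, per_example_grads, learning_rate=0.1,
                          l2_norm_clip=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can move the weights, which is what lets the added noise hide that example's contribution.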

How to use DP: differentially private SGD implementation
TensorFlow Privacy implements differentially private versions of common optimizers:
• sgd → DPGradientDescentGaussianOptimizer
• adam → DPAdamGaussianOptimizer
• adagrad → DPAdagradGaussianOptimizer
• RMSProp → DPRMSPropGaussianOptimizer

How to use DP

    import tensorflow as tf
    # Import path as shown on the slide; depending on the installed version
    # it may need to be prefixed with "tensorflow_privacy.".
    from privacy.optimizers.dp_optimizer import DPGradientDescentGaussianOptimizer

    # The slide points the non-private optimizer to tf.optimizers.SGD.
    GradientDescentOptimizer = tf.optimizers.SGD

    data = ...   # read data
    model = ...  # instantiate model architecture

    if train_with_differential_privacy:
        optimizer = DPGradientDescentGaussianOptimizer(
            l2_norm_clip=l2_norm_clip,
            noise_multiplier=noise_multiplier,
            num_microbatches=microbatches,
            learning_rate=learning_rate)
        # Compute vector of per-example loss rather than its mean over a minibatch.
        loss = tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, reduction=tf.losses.Reduction.NONE)
    else:
        # Train without differential privacy.
        optimizer = GradientDescentOptimizer(learning_rate=learning_rate)
        loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

    model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])

How to use DP: the same snippet, annotated
• With DP: the "if" branch (DPGradientDescentGaussianOptimizer, per-example loss with reduction=tf.losses.Reduction.NONE)
• Without DP: the "else" branch (GradientDescentOptimizer, loss averaged over the minibatch)

Differential privacy vs model performance
Does adding noise degrade performance?

Privacy analysis
How to measure the achieved level of protection: epsilon
• input = n entries → DP Model → output
• input = n-1 entries → DP Model → output
The "difference" between the two outputs is at most ε

Privacy analysis: how to measure the achieved level of protection
Differential privacy is measurable according to its mathematical definition and is expressed by two parameters:
• epsilon (ε): upper bound on how much the probability of a particular model output can vary by adding or removing a single training point. Also called the privacy budget
• delta (δ): bounds the probability of our privacy guarantee not holding
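For reference, the two parameters appear together in the standard definition of (ε, δ)-differential privacy, as given in The Algorithmic Foundations of Differential Privacy. The notation is introduced here and is not on the slides: M is the randomized training algorithm, D and D' are two datasets that differ in a single training point, and S is any set of possible outputs.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
```

With δ = 0 and a small ε, the factor e^ε is close to 1, so the output distribution barely changes when one training point is added or removed.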

Privacy analysis: evolution of epsilon, accuracy and training time with noise
• Epsilon (ε) depends on noise_multiplier: the greater the noise, the smaller epsilon and the stronger the guarantee
• A good value for delta (δ) is << 1/n (n = number of training examples)
Setup: MNIST classification with 2 convolutional + pooling layers, trained for 15 epochs; TensorFlow v. 1.15, TensorFlow Privacy v. 0.2.2; 8 vCPUs, 30 GB RAM, 1 GPU NVIDIA Tesla K80. Model architecture and hyperparameters from the official tutorial. With everything else the same but classical SGD, training took less than 40 seconds.
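The link between noise and epsilon can be sketched with the classical bound for a single Gaussian mechanism (valid only for ε < 1). This is a deliberately simplified illustration: the epsilon reported for a full DP-SGD training run comes from a tighter accounting over all training steps, which this sketch does not reproduce. The function name and the chosen values are hypothetical.

```python
import math


def single_release_epsilon(noise_multiplier, delta):
    """Epsilon of one Gaussian-mechanism release with sensitivity 1.

    Uses the classical bound sigma >= sqrt(2 ln(1.25/delta)) / epsilon,
    which only holds when the resulting epsilon is below 1.
    """
    epsilon = math.sqrt(2.0 * math.log(1.25 / delta)) / noise_multiplier
    if epsilon >= 1.0:
        raise ValueError("bound only valid for epsilon < 1; use more noise")
    return epsilon


n = 60000       # size of the MNIST training set
delta = 1e-5    # chosen below 1/n
for noise_multiplier in (5.0, 10.0, 20.0):
    print(noise_multiplier, single_release_epsilon(noise_multiplier, delta))
```

Doubling the noise multiplier halves epsilon in this simplified setting, which matches the qualitative trend stated above.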

A model inversion attack

The secret revealer: context
THE SECRET REVEALER: GENERATIVE MODEL-INVERSION ATTACKS AGAINST DEEP NEURAL NETWORKS, 2020

Target model: a portrait classifier
The goal of the model is to recognize a person given a person's portrait.

Name          Predicted probability
SciProdank    0.1%
Securmax      0.2%
SciProduce    0.2%
ML-Li         0.2%
Streamèche    99.1%
Buildy        0.3%

Model inversion attack
The goal of the attack is to reconstruct a portrait given a person's name.
[Figure: target model with the names SciProdank, Securmax, SciProduce, ML-Li, Streamèche, Buildy]

Information needed for the attack
• Information used by the adversary: public data, the target model, and the Datamons’ names (SciProdank, Securmax, SciProduce, ML-Li, Streamèche, Buildy)
• Sensitive data under attack: the private data used to train the target model
The original portraits are never accessible, but the objective is to reconstruct them.

The secret revealer: attack

The attack structure: three big steps
1. Learn to generate credible portraits
2. Estimate the difference between generated portraits and the portrait of a targeted person
3. Optimise portrait generation

1. Generate credible portraits
Generative adversarial network (GAN)
• GENERATOR: takes a random latent vector and attempts to create a credible portrait
• CRITIC: receives generated portraits and randomly sampled true portraits, and judges each one as fake (❌) or real (✔)

2.1 Retrieve the target output vector
The adversary wants to reproduce the portrait of Streamèche.

SciProdank  Securmax  SciProduce  ML-Li  Streamèche  Buildy
0           0         0           0      1           0

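2.2 Compute the proximity between the prediction for the reconstructed portrait and the target vector
[Figure: the target model outputs a vector of size m, which is compared with the target vector]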
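The slide does not say which proximity measure is used, so the sketch below shows two common choices, a squared Euclidean distance and a cross-entropy, under that assumption. The helper names and the example prediction are hypothetical; the names and the one-hot target follow the Streamèche example.

```python
import numpy as np

names = ["SciProdank", "Securmax", "SciProduce", "ML-Li", "Streamèche", "Buildy"]


def one_hot(target_name, names):
    """Target output vector: 1 for the targeted person, 0 elsewhere."""
    vector = np.zeros(len(names))
    vector[names.index(target_name)] = 1.0
    return vector


def squared_distance(prediction, target):
    return float(np.sum((prediction - target) ** 2))


def cross_entropy(prediction, target, eps=1e-12):
    return float(-np.sum(target * np.log(prediction + eps)))


target = one_hot("Streamèche", names)
# Hypothetical output of the target model for a reconstructed portrait.
prediction = np.array([0.05, 0.05, 0.10, 0.10, 0.60, 0.10])

print(squared_distance(prediction, target))
print(cross_entropy(prediction, target))
```

Both values shrink as the target model becomes more confident that the reconstructed portrait shows Streamèche.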

3. Optimize portrait generation
Weights of every model (Generator, Critic and Target) are frozen (❄) and used to perform the gradient descent on the latent vector.
• GENERATOR ❄ → Target model ❄ → proximity
• GENERATOR ❄ → CRITIC ❄ (❌/✔) → credibility
• LOSS: combines proximity and credibility
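The optimisation step can be sketched with toy linear models in NumPy. Everything here is a hypothetical stand-in: the generator is a random linear map, the target model is a linear softmax classifier, and the critic scores an image by its closeness to a fixed reference vector. The point is only that the weights stay frozen while gradient descent updates the latent vector.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim, n_classes = 4, 16, 6

# Frozen toy models (hypothetical stand-ins for the trained networks).
generator_w = 0.3 * rng.normal(size=(image_dim, latent_dim))  # G(z) = W_g z
target_w = 0.3 * rng.normal(size=(n_classes, image_dim))      # softmax(W_t x)
public_mean = rng.normal(size=image_dim)                      # critic reference


def softmax(logits):
    shifted = np.exp(logits - logits.max())
    return shifted / shifted.sum()


def loss_and_grad(z, target_class, identity_weight=10.0):
    """Return the attack loss and its gradient with respect to z only."""
    image = generator_w @ z
    probs = softmax(target_w @ image)
    # Credibility: the toy critic prefers images close to public_mean.
    credibility_loss = 0.5 * np.sum((image - public_mean) ** 2)
    # Proximity: cross-entropy between the prediction and the target identity.
    proximity_loss = -np.log(probs[target_class])
    loss = credibility_loss + identity_weight * proximity_loss

    one_hot = np.eye(n_classes)[target_class]
    grad_image = (image - public_mean) + identity_weight * target_w.T @ (probs - one_hot)
    return loss, generator_w.T @ grad_image


z = rng.normal(size=latent_dim)
initial_loss, _ = loss_and_grad(z, target_class=4)
for _ in range(300):
    _, grad = loss_and_grad(z, target_class=4)
    z = z - 0.005 * grad  # only the latent vector is updated
final_loss, _ = loss_and_grad(z, target_class=4)
reconstruction = generator_w @ z
```

In a real attack the three models would be deep networks and the gradient would come from automatic differentiation, but the loop has the same shape.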

Does it work?

Evaluate performance
A global model to make a fair evaluation
[Figure: public data and private data → training → evaluation model]

Evaluate performance

Target model    Attack accuracy
VGG16           28%
ResNet 152      44%
face.evolve     46%

[Figure: the evaluation model is asked whether a reconstruction is Streamèche]
Possible only in a research setting (not in a real situation).
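One way to read "attack accuracy" is as the share of reconstructions that the evaluation model assigns to the targeted identity. The sketch below assumes that reading; the function name and the example predictions are hypothetical.

```python
def attack_accuracy(evaluation_predictions, target_identities):
    """Fraction of reconstructions recognised as their targeted identity."""
    if len(evaluation_predictions) != len(target_identities):
        raise ValueError("one prediction is needed per targeted identity")
    hits = sum(p == t for p, t in zip(evaluation_predictions, target_identities))
    return hits / len(target_identities)


targets = ["Streamèche", "Buildy", "ML-Li", "Securmax"]
# Hypothetical labels given by the evaluation model to the four reconstructions.
predictions = ["Streamèche", "Buildy", "Securmax", "Securmax"]

print(attack_accuracy(predictions, targets))  # 0.75
```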

Examples of reconstructions from scratch
Dataset: CelebA

Use auxiliary information to improve results
• Blurred portrait
• Square mask
• T-shaped mask over mouth and eyes
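The three kinds of auxiliary information can be mimicked on a toy image with NumPy. The mask geometry, the blur kernel and the 64 x 64 image size are assumptions made for illustration, not the settings used in the paper.

```python
import numpy as np


def square_mask(size):
    """Hide a centred square covering the middle half of each axis."""
    mask = np.ones((size, size))
    lo, hi = size // 4, 3 * size // 4
    mask[lo:hi, lo:hi] = 0.0
    return mask


def t_mask(size):
    """Hide a T shape: a horizontal band (eyes) and a vertical bar (mouth)."""
    mask = np.ones((size, size))
    mask[size // 4: size // 2, size // 8: 7 * size // 8] = 0.0
    mask[size // 2: 7 * size // 8, 3 * size // 8: 5 * size // 8] = 0.0
    return mask


def blur(image, k=5):
    """Box blur: average over k x k neighbourhoods, edges padded by reflection."""
    pad = k // 2
    padded = np.pad(image, pad, mode="reflect")
    out = np.zeros_like(image, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (k * k)


rng = np.random.default_rng(0)
portrait = rng.random((64, 64))  # stand-in for a grayscale portrait

blurred_aux = blur(portrait)
square_aux = portrait * square_mask(64)
t_aux = portrait * t_mask(64)
```

Each auxiliary image keeps part of the original information, which gives the generator a starting point that is closer to the targeted portrait.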

Use auxiliary information to improve results
GENERATOR: random latent vector + auxiliary data → attempt to create a credible portrait

Use auxiliary information to improve results

Attack accuracy by type of auxiliary information:

Target model    Blurred    Square mask    T mask
VGG16           43%        78%            58%
ResNet 152      50%        80%            63%
face.evolve     51%        82%            64%

Example of reconstruction with auxiliary information
[Figure columns: auxiliary information, original, reconstruction]

Example of reconstruction with auxiliary information (continued)
[Figure columns: auxiliary information, original, reconstruction]

Our experience
• ❌ Got the same results
• ✔ Built an attack on the MNIST dataset
• ✔ Retrieved a few credible digits
• Not enough details in the paper
• We recently got the code. Surprise: it is in PyTorch, and we started in TensorFlow
To sum up: building an attack is costly (in time, in skills, in money) BUT NOT IMPOSSIBLE.

Are attacks credible and avoidable?

Success depends on restrictive conditions, which are not impossible to meet
1. A model trained on sensitive data exists
2. Full white box access to this model is available
3. The attack is not too expensive compared to the expected profit

1. Models trained on sensitive data
An example: the spread of facial recognition
• Facial recognition is already developed and deployed in several countries
• Experiments have already been done in France
• A market for software and services of up to $7 billion by 2024 worldwide [*]
[*] Facial Recognition Market by Component (Software Tools (2D Recognition, 3D Recognition, and Facial Analytics) and Services), Application Area (Emotion Recognition, Access Control, and Law Enforcement), Vertical, and Region - Global Forecast to 2024

2. Full white box access to the target model
• Several models are fully accessible, for instance those used for transfer learning
• Models are generally not perceived as sensitive data
• Some attacks are designed to retrieve features of models (model extraction attacks)

3. An attack that is not too expensive compared to the expected profit
• Building an attack takes time and is not always successful
• Criminal organisations might have a strong interest in retrieving people's identities

Some plausible use cases?
• A witness who changed identity and whose portrait has been anonymised
• A website that offers to broadcast anonymous videos
[Figure captions: "Securmax is the one that finished the cake", "Welcome to Jojo’s channel"]

Good practices
• Prevent white box access to models for random users
• For models used through an API, limit the number of possible calls to the model for a given period of time (for instance 10 times a day)
• Consider models as sensitive as the data on which they are trained
• Use differential privacy when training models (mainly against membership inference attacks)
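The call limit mentioned above can be sketched as a small in-memory quota. The class name `DailyQuota` and its interface are hypothetical; a deployed API would more likely keep the counters in a shared store and enforce the limit at the gateway.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Allow each user at most max_calls predictions per calendar day."""

    def __init__(self, max_calls=10):
        self.max_calls = max_calls
        self._calls = defaultdict(int)  # (user_id, day) -> number of calls

    def allow(self, user_id, today=None):
        """Record one call and return True, or return False if over quota."""
        key = (user_id, today or date.today())
        if self._calls[key] >= self.max_calls:
            return False
        self._calls[key] += 1
        return True


quota = DailyQuota(max_calls=10)
day = date(2020, 1, 1)
results = [quota.allow("some-user", day) for _ in range(12)]
print(results.count(True), results.count(False))  # 10 2
```

Limiting the number of queries makes black box attacks that rely on many model calls slower and more expensive.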

Conclusion
• Probably not an imminent and critical risk
• But it is still a good idea to adopt good practices to prevent possible future security breaches
• We hope you now have a few new tools to judge for yourself whether or not your model needs to be protected

To go further
• Demystifying Membership Inference Attacks in Machine Learning as a Service, Stacey Truex, Ling Liu, Mehmet Emre Gursoy, Lei Yu, and Wenqi Wei, 2019
• The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks, Yuheng Zhang, Ruoxi Jia, Hengzhi Pei, Wenxiao Wang, Bo Li, Dawn Song, 2020
• Privacy attacks on Machine Learning, Medium, Ilja Moisejevs, 2019
• The Algorithmic Foundations of Differential Privacy
• Deep Learning with Differential Privacy
• TF Privacy tutorial
• The Secret Sharer: Evaluating and Testing Unintended Memorisation in Neural Networks
• https://www.cnil.fr/sites/default/files/atoms/files/reconnaissance_faciale.pdf
• https://videos.senat.fr/video.1024595_5c5aeeb301759.audition-de-mme-valerie-pecresse-presidente-de-la-region-ile-de-france-sur-le-grand-paris-et-sur-l?timecode=4712000
• https://www.lemonde.fr/pixels/article/2019/08/28/reconnaissance-faciale-la-cnil-tique-sur-le-bilan-de-l-experience-nicoise_5503769_4408996.html
• https://www.franceculture.fr/emissions/la-bulle-economique/reconnaissance-faciale-quand-les-industriels-poussent-a-son-developpement
• https://www.arte.tv/fr/videos/083310-000-A/tous-surveilles-7-milliards-de-suspects/

CONTACT
Giulia Bianchi, Johan Jublanc
THANK YOU
