Uncovering the secrets of a machine learning model: a credible threat?

There are many types of attacks against machine learning models, and some of them aim to reveal confidential information about the data used to train them. We put on our hacker hats to dig into the subject, and in this talk we discuss how credible such threats are.

First, we detail how an attack works whose goal is to reconstruct the training data from the model's outputs (model inversion attack). Then we cover the use of mathematics to make the training of deep learning models more reliable (differential privacy), and the trade-offs this requires in terms of performance.

Video: https://youtu.be/5VSIu0esquA

Giulia

June 02, 2020
Transcript

  1. Uncovering the secrets of a machine learning model: a credible threat?
  2. Giulia BIANCHI, Data Scientist, @PubSapientEng, @Giuliabianchl. Johan JUBLANC, Data Scientist, @PubSapientEng, JJublanc
  3. OUTLINE: Model attacks • Differential privacy • A model inversion attack: the secret revealer • Are attacks credible and avoidable?
  4. Model attacks
  5. What do you mean: “attacking a model”?
  6. A lot of ways to attack a model: 1. Evasion attacks 2. Poisoning attacks 3. Privacy attacks [Figure labels: Classifier, Toaster]
  7. Privacy attacks: attacks on training data • model extraction • membership inference attacks • inversion attacks • ?
  8. White box vs black box accessibility. With black box access, only the inputs and outputs of the model (Securmax) are visible. White box access also reveals:
      • model type
      • model architecture
      • parameters’ values
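The difference between the two access levels can be sketched in a few lines. This is only an illustration: `ToyClassifier` and `BlackBoxAccess` are hypothetical names, not something shown in the talk, and in practice the black box boundary is usually a network API rather than a Python wrapper.

```python
class ToyClassifier:
    """Hypothetical linear classifier standing in for a deployed model."""

    def __init__(self, weights, bias):
        self.weights = weights  # parameters' values (visible with white box access)
        self.bias = bias

    def predict(self, inputs):
        score = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return 1 if score > 0 else 0


class BlackBoxAccess:
    """Exposes the inputs -> outputs mapping only, like a prediction API would."""

    def __init__(self, model):
        self._predict = model.predict

    def query(self, inputs):
        return self._predict(inputs)


model = ToyClassifier(weights=[0.5, -1.0], bias=0.1)

white_box = model                  # type, architecture and parameters are readable
black_box = BlackBoxAccess(model)  # only query() is offered

print(type(white_box).__name__, white_box.weights, white_box.bias)
print(black_box.query([1.0, 0.2]))
```

An adversary holding `white_box` can compute gradients through the model, which is what the inversion attack described later in the deck relies on.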
  9. HELLO DATAMONS! Streamèche, Realticèle, DS-Li, FastFeu, Prototys, SciProduce, Securti, Securalto, SciProdaffe, Buildy, ML-Li, AI-Li, Productor, SciProdank, Securmax
  10. Differential privacy
  11. When DP is useful: protection against unintended memorisation. The Secret Sharer: Evaluating and Testing Unintended Memorisation in Neural Networks
      • Black-box attack on a sequence-generating neural network
      • The authors introduced a social security number into the training data (the secret) and were able to retrieve it at prediction time
      • Differential privacy solved the problem
      https://xkcd.com/2169/
  12. What DP does. Training a model: input = n entries → Model → output; input = n-1 entries → Model → output
  13. What DP does. Training a Differentially Private (DP) model: input = n entries → DP Model → output; input = n-1 entries → DP Model → output
  14. What DP is: the (mathematical) theory
      • Differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population
      • Roughly, an algorithm is differentially private if an observer seeing its output cannot tell whether a particular individual's information was used in the computation
      • Differential privacy provides privacy by process; in particular, it introduces randomness
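To make "privacy by process" concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is not the mechanism used later in the deck (which adds Gaussian noise to SGD); it is a simpler, standard example chosen for illustration. The data, the `private_count` helper and the value of epsilon are all assumptions.

```python
import random


def private_count(entries, predicate, epsilon, rng):
    """Counting query released with Laplace noise of scale 1/epsilon.

    Adding or removing one entry changes a count by at most 1 (sensitivity 1),
    so Laplace(1/epsilon) noise gives epsilon-differential privacy.
    """
    true_count = sum(1 for entry in entries if predicate(entry))
    # The difference of two exponential draws with rate epsilon
    # follows a Laplace distribution with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


def over_40(age):
    return age > 40


rng = random.Random(0)
ages_with = [34, 29, 51, 42, 67]  # n entries (hypothetical data)
ages_without = ages_with[:-1]     # n-1 entries: one individual removed

print(private_count(ages_with, over_40, epsilon=0.5, rng=rng))
print(private_count(ages_without, over_40, epsilon=0.5, rng=rng))
```

Because both answers are noisy, an observer seeing a single output cannot reliably tell whether the last individual was in the dataset, while the average over a large population stays informative.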
  15. From theory to practice: differentially private stochastic gradient descent (SGD). Differential privacy is introduced in deep learning algorithms by adding Gaussian noise to stochastic gradient descent, which yields a DP model
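A single DP-SGD update can be sketched as follows, following the usual recipe (clip each per-example gradient, sum, add Gaussian noise, average). This is a plain-Python illustration, not the TensorFlow Privacy implementation; `dp_sgd_step` is a hypothetical helper, each example is treated as its own microbatch, and the parameter names simply mirror the optimizer arguments shown later in the deck.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, l2_norm_clip, noise_multiplier,
                learning_rate, rng):
    """One noisy gradient step on a list of float parameters."""
    clipped_sum = [0.0] * len(params)
    for grad in per_example_grads:
        # Clip each per-example gradient to an L2 norm of at most l2_norm_clip.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            clipped_sum[i] += g * scale
    # Gaussian noise calibrated to the clipping norm, added to the sum.
    stddev = noise_multiplier * l2_norm_clip
    batch_size = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, stddev)) / batch_size for s in clipped_sum]
    return [p - learning_rate * g for p, g in zip(params, noisy_mean)]


rng = random.Random(0)
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.6, 0.8]]  # hypothetical per-example gradients
print(dp_sgd_step(params, grads, l2_norm_clip=1.0, noise_multiplier=1.1,
                  learning_rate=0.1, rng=rng))
```

Clipping bounds how much any single example can move the parameters; the noise then hides that bounded contribution.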
  16. How to use DP: differentially private SGD implementation. TensorFlow Privacy implements differentially private versions of common optimizers:
      • sgd → DPGradientDescentGaussianOptimizer
      • adam → DPAdamGaussianOptimizer
      • adagrad → DPAdagradGaussianOptimizer
      • RMSProp → DPRMSPropGaussianOptimizer
  17. How to use DP

          import tensorflow as tf
          # Slide callout: privacy.optimizers.dp_optimizer.DPGradientDescentGaussianOptimizer
          from tensorflow_privacy.privacy.optimizers.dp_optimizer import (
              DPGradientDescentGaussianOptimizer)

          # Slide callout: the non-private optimizer corresponds to tf.optimizers.SGD.
          GradientDescentOptimizer = tf.optimizers.SGD

          data = ...   # read data
          model = ...  # instantiate model architecture

          if train_with_differential_privacy:
              optimizer = DPGradientDescentGaussianOptimizer(
                  l2_norm_clip=l2_norm_clip,
                  noise_multiplier=noise_multiplier,
                  num_microbatches=microbatches,
                  learning_rate=learning_rate)
              # Compute vector of per-example loss rather than its mean over a minibatch.
              loss = tf.keras.losses.CategoricalCrossentropy(
                  from_logits=True, reduction=tf.losses.Reduction.NONE)
          else:
              # Train without differential privacy.
              optimizer = GradientDescentOptimizer(learning_rate=learning_rate)
              loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

          model.compile(optimizer=optimizer, loss=loss, metrics=['accuracy'])
  18. How to use DP: the same code as on the previous slide, annotated “With DP” on the DPGradientDescentGaussianOptimizer branch and “Without DP” on the else branch
  19. Differential privacy vs model performance: does adding noise degrade performance?
  20. Privacy analysis: how to measure the achieved level of protection: epsilon. Compare input = n entries → DP Model → output with input = n-1 entries → DP Model → output. The "difference" between the two outputs is at most epsilon (ε)
  21. Privacy analysis: how to measure the achieved level of protection. Differential privacy is measurable according to its mathematical definition and is expressed by two parameters:
      • epsilon (ε): upper bound on how much the probability of a particular model output can vary when a single training point is added or removed. Also called the privacy budget
      • delta (δ): bounds the probability of the privacy guarantee not holding
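For reference, the standard formal statement behind these two parameters (as given in The Algorithmic Foundations of Differential Privacy) can be written as follows. The symbols are notation introduced here, not taken from the slides: $\mathcal{M}$ is the randomized training algorithm, $D$ and $D'$ are two datasets differing in a single entry (n versus n-1 entries), and $S$ is any set of possible outputs.

```latex
\[
\Pr\bigl[\mathcal{M}(D) \in S\bigr]
  \;\le\; e^{\varepsilon}\,\Pr\bigl[\mathcal{M}(D') \in S\bigr] + \delta
\]
```

A smaller $\varepsilon$ forces the two output distributions closer together, and $\delta$ is the small probability mass for which that multiplicative bound is allowed to fail.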
  22. Privacy analysis: evolution of epsilon, accuracy and training time with noise
      • Epsilon (ε) depends on noise_multiplier: the greater the noise, the smaller epsilon and the stronger the guarantee
      • A good value for delta (δ) is << 1/n (n = number of training entries)
      Setup: MNIST classification with 2 convolutional + pooling layers, trained for 15 epochs; TensorFlow v. 1.15; TensorFlow Privacy v. 0.2.2; 8 vCPUs, 30 GB RAM, 1 GPU NVIDIA Tesla K80. Model architecture and hyperparameters from the official tutorial. With everything else the same but classical SGD, training took less than 40 seconds.
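The inverse relation between noise and epsilon can be illustrated with the classical bound for a single Gaussian-mechanism release, sigma >= sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which holds for epsilon < 1. This is a deliberate simplification: it is not the accounting TensorFlow Privacy performs over many SGD steps, so the numbers are not comparable with the chart on the slide. The MNIST training set size and the delta value below are assumptions made for the example.

```python
import math


def gaussian_mechanism_epsilon(noise_multiplier, delta):
    """Epsilon for ONE Gaussian-mechanism release (classical bound, epsilon < 1).

    noise_multiplier is the noise standard deviation divided by the
    sensitivity (the clipping norm in DP-SGD).
    """
    return math.sqrt(2.0 * math.log(1.25 / delta)) / noise_multiplier


n = 60000     # assumed number of training entries (MNIST training set)
delta = 1e-5  # chosen below 1/n

for noise_multiplier in (5.0, 10.0, 20.0):
    epsilon = gaussian_mechanism_epsilon(noise_multiplier, delta)
    print(f"noise_multiplier={noise_multiplier:5.1f}  epsilon={epsilon:.3f}")
```

Doubling the noise halves epsilon in this single-release setting; in real training, epsilon also grows with the number of steps, which is why a dedicated privacy accountant is used.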
  23. A model inversion attack
  24. The secret revealer: context. THE SECRET REVEALER: GENERATIVE MODEL-INVERSION ATTACKS AGAINST DEEP NEURAL NETWORKS, 2020
  25. Target model: a portrait classifier. The goal of the model is to recognize a person given that person's portrait. Example output: Scyprodank 0.1%, Securmax 0.2%, SciProduce 0.2%, ML-Li 0.2%, Streamèche 99.1%, Buildy 0.3%
  26. Model inversion attack. The goal of the attack is to reconstruct a portrait given a person's name (Scyprodank, Securmax, SciProduce, ML-Li, Streamèche, Buildy) and the target model
  27. Information needed to attack. The original portraits are never accessible, but the objective is to reconstruct them. [Diagram labels: Private data (sensitive data under attack) • Training • Target model • Public data • Datamons’ names (Scyprodank, Securmax, SciProduce, ML-Li, Streamèche, Buildy) • Information used by the adversary]
  28. The secret revealer: attack
  29. The attack structure: three big steps
      1. Learn to generate credible portraits
      2. Estimate the difference between generated portraits and the portrait of a targeted person
      3. Optimise portrait generation
  30. Step 1: generate credible portraits with a generative adversarial network (GAN). [Diagram: random latent vector → GENERATOR → attempt to create a credible portrait; true portraits → random sampling; both feed the CRITIC → ❌ / ✔]
  31. Step 2.1: retrieve the target output vector. The adversary wants to reproduce the portrait of Streamèche: Scyprodank 0, Securmax 0, SciProduce 0, ML-Li 0, Streamèche 1, Buildy 0
  32. Step 2.2: compute the proximity between the prediction for the reconstructed portrait and the target vector. [Diagram: Target model, output of size m]
  33. Step 3: optimize portrait generation. The weights of every model (Generator, Critic and Target) are frozen (❄) and used to perform gradient descent on the latent vector. [Diagram: GENERATOR ❄ → Target model ❄ (proximity) and CRITIC ❄ ❌✔ (credibility) → LOSS]
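The three steps can be sketched end to end on a toy problem. Everything here is a stand-in chosen so the example runs with the standard library only: the "generator", "critic" and "target model" are small random linear maps rather than trained networks, and the weighting `lam`, the dimensions and the learning rate are arbitrary assumptions. The loss follows the structure described above: a credibility term from the critic plus a proximity term (cross-entropy) between the target model's prediction and the targeted identity, minimised over the latent vector only.

```python
import math
import random

rng = random.Random(0)
LATENT, PIXELS, CLASSES = 4, 8, 6


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.3) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Frozen toy weights: they are never updated below.
G = rand_matrix(PIXELS, LATENT)                   # generator: latent -> portrait
V = rand_matrix(CLASSES, PIXELS)                  # target model: portrait -> logits
u = [rng.gauss(0.0, 0.3) for _ in range(PIXELS)]  # critic: portrait -> credibility


def softmax(logits):
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_and_grad(z, target, lam):
    """Loss = -credibility + lam * cross-entropy, and its gradient w.r.t. z."""
    x = matvec(G, z)
    probs = softmax(matvec(V, x))
    credibility = sum(a * b for a, b in zip(u, x))
    loss = -credibility - lam * math.log(probs[target])
    dlogits = [lam * (p - (1.0 if k == target else 0.0))
               for k, p in enumerate(probs)]
    dx = [-ui + gi for ui, gi in zip(u, matvec(transpose(V), dlogits))]
    dz = matvec(transpose(G), dx)
    return loss, dz


def invert(target, steps=200, lr=0.1, lam=1.0):
    """Gradient descent on the latent vector only."""
    z = [rng.gauss(0.0, 1.0) for _ in range(LATENT)]
    history = []
    for _ in range(steps):
        loss, dz = loss_and_grad(z, target, lam)
        history.append(loss)
        z = [zi - lr * gi for zi, gi in zip(z, dz)]
    return z, history


# Index 4 is the position of the 1 in the target vector of step 2.1.
z_star, history = invert(target=4)
reconstruction = matvec(G, z_star)
print(history[0], history[-1])
```

With real networks the same loop would rely on automatic differentiation through the frozen generator, critic and target model, which is why white box access matters for this attack.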
  34. Does it work?
  35. Evaluate performances: a global model to make a fair evaluation. [Diagram labels: Public data • Private data • Training • Evaluation model]
  36. Evaluate performances. The evaluation model checks whether a reconstruction is recognised as the targeted person (Streamèche?). This is possible only in a research setting (not in a real situation).

      Target model | Attack accuracy
      VGG16        | 28%
      ResNet 152   | 44%
      face.evolve  | 46%
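As a rough sketch of how such an attack accuracy could be computed: each reconstruction is passed to the evaluation model and counted as a success when the predicted identity matches the targeted one. The `attack_accuracy` helper, the toy evaluation model and the data are hypothetical and only illustrate the bookkeeping.

```python
def attack_accuracy(evaluation_predict, reconstructions):
    """Share of reconstructions recognised as their targeted identity.

    reconstructions is a list of (targeted_name, reconstructed_portrait) pairs.
    """
    hits = sum(1 for target, portrait in reconstructions
               if evaluation_predict(portrait) == target)
    return hits / len(reconstructions)


def toy_evaluation_model(portrait):
    """Hypothetical evaluation model: the brightest 'pixel' decides the name."""
    names = ["Streamèche", "Securmax", "Buildy"]
    return names[portrait.index(max(portrait))]


reconstructions = [
    ("Streamèche", [0.9, 0.1, 0.2]),  # recognised as Streamèche
    ("Securmax", [0.2, 0.7, 0.1]),    # recognised as Securmax
    ("Buildy", [0.6, 0.3, 0.5]),      # mistaken for Streamèche
    ("Buildy", [0.1, 0.2, 0.8]),      # recognised as Buildy
]
print(attack_accuracy(toy_evaluation_model, reconstructions))
```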
  37. Examples of reconstructions from scratch. Dataset: CelebA
  38. Use auxiliary information to improve results
      • Blurred portrait
      • Square mask
      • T-shaped mask over mouth and eyes
  39. Use auxiliary information to improve results. [Diagram: random latent vector + aux data → GENERATOR → attempt to create a credible portrait]
  40. Use auxiliary information to improve results. Attack accuracy by type of auxiliary information:

      Target model | Blurred | Square mask | T mask
      VGG16        | 43%     | 78%         | 58%
      ResNet 152   | 50%     | 80%         | 63%
      face.evolve  | 51%     | 82%         | 64%
  41. Example of reconstruction with auxiliary information. [Figure columns: Aux. info • Original • Reconst]
  42. Example of reconstruction with auxiliary information. [Figure columns: Aux. info • Original • Reconst]
  43. Our experience
      ❌ Got the same results
      ✔ Built an attack on the MNIST dataset
      ✔ Retrieved a few credible digits
      Not enough details in the paper. We recently got the code. Surprise: it is in PyTorch… we started in TensorFlow
      To sum up: building an attack is costly (in time, in skills, in money)... BUT NOT IMPOSSIBLE
  44. Are attacks credible and avoidable?
  45. Success depends on restrictive conditions, which are nonetheless not impossible to meet:
      1. A model trained on sensitive data exists
      2. Full white box access to this model is available
      3. The attack is not too expensive compared to the expected profit
  46. 1. Models trained on sensitive data. An example: the spread of facial recognition
      • Facial recognition is already developed and deployed in several countries
      • Experiments have already been carried out in France
      • A worldwide market for software and services of up to $7 billion by 2024*
      [*] Facial Recognition Market by Component (Software Tools (2D Recognition, 3D Recognition, and Facial Analytics) and Services), Application Area (Emotion Recognition, Access Control, and Law Enforcement), Vertical, and Region - Global Forecast to 2024
  47. 2. Full white box access to the target model
      • Several models are fully accessible, for instance those used for transfer learning
      • Models are generally not perceived as sensitive data
      • Some attacks are designed to retrieve features of models (model extraction attacks)
  48. 3. Not too expensive an attack compared to the expected profit
      • Building an attack takes time and is not always successful
      • Criminal organisations might have a strong interest in retrieving people's identities
  49. Some plausible use-cases?
      • A witness who changed identity and whose portrait has been anonymised
      • A website that offers to broadcast anonymous videos
      [Figure captions: “Securmax is the one that finished the cake” • “Welcome to Jojo’s channel”]
  50. Good practices
      • Prevent white box access to models for random users
      • For models used through an API, limit the number of calls to the model in a given period of time (for instance 10 times a day)
      • Consider models as sensitive as the data on which they are trained
      • Use differential privacy when training models (mainly against membership attacks)
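The API call limit mentioned above could look like the following sliding-window sketch. `CallLimiter` and its parameters are hypothetical; a production service would more likely rely on its API gateway or a shared store than on in-process memory.

```python
import time
from collections import defaultdict, deque


class CallLimiter:
    """Allows at most max_calls per user within a sliding period."""

    def __init__(self, max_calls=10, period_seconds=24 * 3600,
                 clock=time.monotonic):
        self._max_calls = max_calls
        self._period = period_seconds
        self._clock = clock
        self._calls = defaultdict(deque)  # user id -> timestamps of recent calls

    def allow(self, user_id):
        now = self._clock()
        calls = self._calls[user_id]
        # Forget calls that have left the time window.
        while calls and now - calls[0] >= self._period:
            calls.popleft()
        if len(calls) >= self._max_calls:
            return False
        calls.append(now)
        return True


limiter = CallLimiter(max_calls=10, period_seconds=24 * 3600)
if limiter.allow("user-42"):
    print("forward the request to the model")
else:
    print("reject: daily quota reached")
```

Capping the number of queries raises the cost of attacks that need many model calls; it does not help once white box access has been granted.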
  51. Conclusion
      • Probably not an imminent and critical risk
      • But it is still a good idea to adopt good practices to prevent possible future security breaches
      • We hope you now have a few new tools to judge for yourself whether your model needs to be protected
  52. To go further
      • Demystifying Membership Inference Attacks in Machine Learning as a Service, Stacey Truex, Ling Liu, Mehmet Emre Gursoy, Lei Yu, and Wenqi Wei, 2019
      • The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks, Yuheng Zhang, Ruoxi Jia, Hengzhi Pei, Wenxiao Wang, Bo Li, Dawn Song, 2020
      • Privacy attacks on Machine Learning, Medium, Ilja Moisejevs, 2019
      • The Algorithmic Foundations of Differential Privacy
      • Deep Learning with Differential Privacy
      • TF Privacy tutorial
      • The Secret Sharer: Evaluating and Testing Unintended Memorisation in Neural Networks
      • https://www.cnil.fr/sites/default/files/atoms/files/reconnaissance_faciale.pdf
      • https://videos.senat.fr/video.1024595_5c5aeeb301759.audition-de-mme-valerie-pecresse-presidente-de-la-region-ile-de-france-sur-le-grand-paris-et-sur-l?timecode=4712000
      • https://www.lemonde.fr/pixels/article/2019/08/28/reconnaissance-faciale-la-cnil-tique-sur-le-bilan-de-l-experience-nicoise_5503769_4408996.html
      • https://www.franceculture.fr/emissions/la-bulle-economique/reconnaissance-faciale-quand-les-industriels-poussent-a-son-developpement
      • https://www.arte.tv/fr/videos/083310-000-A/tous-surveilles-7-milliards-de-suspects/
  53. CONTACT: Giulia Bianchi (gbianchi@xebia.fr), Johan Jublanc (jjublanc@xebia.fr). THANK YOU