
Security in Machine Learning

Adversarial attacks, model failures and countermeasures.

Flavio Clesio

August 06, 2020


  1. Flávio Clésio August, 2020 PyCon Africa 2020 Security in Machine

    Learning Adversarial attacks, model failures and countermeasures
  2. ABOUT FLÁVIO CLÉSIO (flavioclesio)

    • Machine Learning Engineer @ Germany
    • MSc. in Production Engineering (Computational Intelligence)
    • Blog: flavioclesio.com
    • Some conferences I have attended: Strata Hadoop World, Spark Summit, PAPIS.io, The Developers Conference (and now, PyCon Africa 2020!)
  3. AIMS AND OBJECTIVES OF THIS TALK

    AIMS: Raise awareness of the latent risks in Machine Learning development and (hopefully) sow the seed of security practices in Machine Learning Engineering.
    OBJECTIVES:
    • Demonstrate some attacks/failures using a toy dataset and scripts
    • Show some countermeasures that can be applied to reduce the attack surface
    • Point to the related literature for further research in Adversarial Machine Learning
    This talk is based mostly on the prior work of Kumar, O'Brien, Albert, Viljoen, and Snover, "Failure Modes in Machine Learning" [1].
  4. DISCLAIMER References: [29] - The effectiveness of car security devices

    and their role in the crime drop, [30] - Reviewing the effectiveness of electronic vehicle immobilisation: Evidence from four countries, [31] - Do car alarms do any good?, [32] - Adventures in Automotive Networks and Control Units, [33] - Regulating crime prevention design into consumer products: Learning the lessons from electronic vehicle immobilisation, [34] - Does locking your car doors reduce theft?
  5. • Intentional failures where the failure is caused by an

    active adversary attempting to subvert the system to attain their goals – either to misclassify the result, infer private training data, or to steal the underlying algorithm • Unintentional failures where the failure is because an ML system produces a formally correct but completely unsafe outcome [1] FAILURE MODES IN MACHINE LEARNING
  6. SOME NECESSARY CONCEPTS

    "C.I.A. TRIAD"
    • Confidentiality
    • Integrity
    • Availability
    BLACKBOX/WHITEBOX ATTACKS
    • How much knowledge of, and access (clearance) to, the ML system the attacker has
  7. INTENTIONAL ATTACKS: PERTURBATION ATTACK

    In perturbation-style attacks, the attacker stealthily modifies the query to get a desired response [1].
    COUNTERMEASURES:
    • Input monitoring: histograms, anomaly detection, confidence bands
    • Coherence checks at model deployment
    • NLP: word distribution, TF-IDF score shift
    • Computer Vision: color density histograms, Human-in-the-Loop
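As a toy illustration of the input-monitoring idea, the sketch below learns per-feature acceptance bands from clean reference data and flags queries that fall outside them. All function names, thresholds, and data are illustrative, not part of the original talk:

```python
import numpy as np

def fit_reference_bands(reference, low_q=0.5, high_q=99.5):
    """Learn per-feature acceptance bands from clean training inputs."""
    low = np.percentile(reference, low_q, axis=0)
    high = np.percentile(reference, high_q, axis=0)
    return low, high

def flag_out_of_band(batch, low, high):
    """Return a boolean mask of rows with any feature outside its band."""
    return ((batch < low) | (batch > high)).any(axis=1)

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, size=(10_000, 3))
low, high = fit_reference_bands(reference)

clean = rng.normal(0.0, 1.0, size=(100, 3))
perturbed = clean.copy()
perturbed[:, 0] += 8.0  # crude stand-in for an adversarial perturbation

print(flag_out_of_band(clean, low, high).mean())      # mostly in-band
print(flag_out_of_band(perturbed, low, high).mean())  # mostly flagged
```

Real perturbation attacks are designed to stay small, so a band check like this only catches the crudest probes; it is a cheap first layer, not a complete defense.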
  8. INTENTIONAL ATTACKS: EXPLOIT SOFTWARE DEPENDENCIES

    In this attack, the attacker does not manipulate the algorithms; instead, they exploit software vulnerabilities, such as a buffer overflow, within the system's dependencies [1].
    COUNTERMEASURES:
    • Dependency-management and scanning tools can help (Snyk, Poetry, Bandit, Safety, requires.io)
    • Regular security scanning
    • If possible, follow security reports
    • Avoid wrappers and pre-built environments
    • Use as few dependencies as possible (i.e. avoid the five-tonne sandwich)
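A minimal complement to dependency-scanning tools is to verify at startup that what is actually installed matches what was pinned at build time. The sketch below is a hypothetical check using only the standard library; in practice the pin list would come from a lock file produced by a tool such as Poetry or pip-tools:

```python
from importlib import metadata

# Hypothetical pin list; None means "only verify the package is present".
PINNED = {"pip": None}

def audit_installed(pins):
    """Compare installed distributions against pinned expectations."""
    problems = []
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if expected is not None and installed != expected:
            problems.append(f"{name}: {installed} != pinned {expected}")
    return problems

print(audit_installed(PINNED))
```

This catches drift and tampering in the deployed environment, but it does not replace vulnerability scanning: a correctly pinned version can still be a vulnerable one.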
  9. UNINTENDED FAILURES: DISTRIBUTIONAL SHIFTS

    The system is tested in one kind of environment but is unable to adapt to changes in other kinds of environments [1].
    COUNTERMEASURES:
    • Monitoring trinity: inputs, model outputs, KPIs
    • Input monitoring: histograms, accepted variational bands
    • Keep up with exogenous aspects (i.e. respect causality)
    • Adaptive shift alerting
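One common way to turn input monitoring into a concrete drift alert is the Population Stability Index (PSI), which compares the binned distribution of live inputs against the training distribution. The sketch below is illustrative; the usual rule-of-thumb thresholds (below 0.1 stable, above 0.25 significant shift) are conventions, not guarantees:

```python
import numpy as np

def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one feature."""
    edges = np.percentile(reference, np.linspace(0, 100, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range values
    ref_frac = np.histogram(reference, edges)[0] / len(reference)
    cur_frac = np.histogram(current, edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)  # avoid log(0)
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 50_000)
stable = rng.normal(0.0, 1.0, 5_000)
shifted = rng.normal(1.0, 1.0, 5_000)  # the environment moved

print(psi(train, stable))   # small: no alert
print(psi(train, shifted))  # large: raise a drift alert
```

Running this per feature on a schedule gives the "adaptive shift alerting" bullet a concrete trigger.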
  10. UNINTENDED FAILURES: NATURAL ADVERSARIAL EXAMPLES

    The system misrecognizes an input that was found using hard negative mining [1][2].
    COUNTERMEASURES:
    • Extra layers of checking
    • Diversity in labeled data
    • A good data-augmentation strategy
    • If possible, check model behavior against corner cases to understand its limits
    • Error analysis
  11. CLOSING THOUGHTS

    • Security/failure awareness is the first step toward change
    • Countermeasures are practices that minimize attack vectors and, consequently, reduce the attack surface
    • Adjust your countermeasures according to a realistic exposure surface [2]
  12. REFERENCES

    [1] - Kumar, Ram Shankar Siva et al. "Failure modes in machine learning systems." arXiv preprint arXiv:1911.11034 (2019).
    [2] - Gilmer, Justin et al. "Motivating the rules of the game for adversarial example research." arXiv preprint arXiv:1807.06732 (2018).
    [3] - A Complete List of All (arXiv) Adversarial Example Papers, by Nicholas Carlini.
    [4] - Adversarial Machine Learning Reading List, by Nicholas Carlini.
    [5] - "600,000 Images Removed from AI Database After Art Project Exposes Racist Bias."
    [6] - Xiao, Qixue et al. "Security risks in deep learning implementations." 2018 IEEE Security and Privacy Workshops (SPW). IEEE, 2018.
    [7] - TextAttack: Generating adversarial examples for NLP models.
    [8] - Xie, Cihang et al. "Smooth Adversarial Training." arXiv preprint arXiv:2006.14536 (2020).
    [9] - Vial, Daniel, Sanjay Shakkottai, and R. Srikant. "Robust Multi-Agent Multi-Armed Bandits." arXiv preprint arXiv:2007.03812 (2020).
    [10] - Goodfellow, Ian et al. "Generative adversarial nets." Advances in Neural Information Processing Systems (2014).
  13. REFERENCES

    [11] - ImageNet: "Towards Fairer Datasets: Filtering and Balancing the Distribution of the People Subtree in the ImageNet Hierarchy."
    [12] - GAN Lab.
    [13] - Experimental Security Research of Tesla Autopilot, Tencent Keen Security Lab.
    [14] - "Learning Robust Models for e-Commerce Product Search."
    [15] - "Adversarial Machine Learning Mitigation: Adversarial Learning."
    [16] - "A Game Theoretical Approach for Adversarial Machine Learning: How to Use Game Theory to Address Adversarial Risks?"
    [17] - Nazemi, Amir, and Paul Fieguth. "Potential adversarial samples for white-box attacks." arXiv preprint arXiv:1912.06409 (2019).
    [18] - Donahue, Jeff et al. "End-to-End Adversarial Text-to-Speech." arXiv preprint arXiv:2006.03575 (2020).
    [19] - Deldjoo, Yashar, Tommaso Di Noia, and Felice Antonio Merra. "Adversarial Machine Learning in Recommender Systems: State of the art and Challenges." arXiv preprint arXiv:2005.10322 (2020).
    [20] - Siva Kumar, Ram Shankar et al. "Adversarial Machine Learning - Industry Perspectives." Available at SSRN 3532474 (2020).
  14. REFERENCES

    [21] - Gupta, Abhishek, and Erick Galinkin. "Green Lighting ML: Confidentiality, Integrity, and Availability of Machine Learning Systems in Deployment." arXiv preprint arXiv:2007.04693 (2020).
    [22] - "Tricking Neural Networks: Create your own Adversarial Examples."
    [23] - Schneier, Bruce. "Attacking Machine Learning Systems." IEEE Annals of the History of Computing 53.05 (2020): 78-80.
    [24] - Threat Modeling AI/ML Systems and Dependencies.
    [25] - Ballet, Vincent et al. "Imperceptible Adversarial Attacks on Tabular Data." arXiv preprint arXiv:1911.03274 (2019).
    [26] - Adversarial example using FGSM.
    [27] - "Adversarial attacks: How to trick computer vision."
    [28] - "Attacking Machine Learning with Adversarial Examples."
    [29] - "The effectiveness of car security devices and their role in the crime drop."
    [30] - "Reviewing the effectiveness of electronic vehicle immobilisation: Evidence from four countries."
  15. REFERENCES

    [31] - "Do car alarms do any good?"
    [32] - "Adventures in Automotive Networks and Control Units."
    [33] - "Regulating crime prevention design into consumer products: Learning the lessons from electronic vehicle immobilisation."
    [34] - "Does locking your car doors reduce theft?"
  16. GENERAL SECURITY COUNTERMEASURES

    • Consistency checks on the object/model: if shipping a serialized model/object is unavoidable, load the object and run a routine that checks (i) the values of some attributes (e.g. the number of classes in the model, specific feature values, the tree structure), (ii) the model's accuracy against a holdout dataset (e.g. if the holdout accuracy should be 85%, raise an error on any different value), and (iii) the object's size and last modification time.
    • Use "false features": false features are inputs that the API requests but that, in reality, are not used by the model. The objective is to enlarge the search space an attacker must explore (e.g. a model may use only 9 features while the API requests 25, i.e. 16 false features).
    • Model request monitoring: tactics range from monitoring the IP addresses hitting the API to cross-checking requests against value patterns and the time intervals between requests.
  17. GENERAL SECURITY COUNTERMEASURES

    • Incorporate consistency checks in CI/CD/CT/CC: CI/CD is a common term in ML Engineering, but I would like to briefly sketch two further concepts: Continuous Training (CT) and Continuous Consistency (CC). In Continuous Training, the model is retrained on a regular schedule such that, given the same data and model-building parameters, it always produces the same results. Continuous Consistency is an additional checking layer on top of CI/CD that asserts the consistency of all parameters and all data contained in the ML objects/models. In CC, if any attribute value differs from the values produced by CT, the pipeline breaks, and someone must identify the inconsistent attribute and investigate the root cause of the difference.
    • Avoid exposing pickled models on any filesystem (e.g. S3) where someone can gain access: as we saw before, once someone has access to the ML model/objects, white-box attacks become quite easy; i.e. removing object access reduces the exposure/attack surface.
  18. GENERAL SECURITY COUNTERMEASURES

    • If possible, encapsulate the coefficients in the application code and make them private: the heart of these vulnerabilities is access to the ML object. Removing those objects and incorporating only the model's coefficients and/or rules in the code (using private classes) can be a good way to disclose less information about the model.
    • Incorporate the concept of Continuous Training in the ML pipeline: the trick here is to change the model frequently to confuse potential attackers (e.g. different feature positions between the API and the ML object), while checking the reproducibility of results (e.g. accuracy, recall, F1 score) in the pipeline.
    • Use heuristics and rules to prune edge cases: attackers like to start probing their search space with edge (absurd) cases, see whether the model returns exploitable results, and fine-tune on top of that. Heuristics and/or rules on the API side can catch those cases and throw a cryptic error, making the attacker's job considerably harder.
  19. GENERAL SECURITY COUNTERMEASURES

    • Talk is silver, silence is golden: one of the things I learned in security is that the less you talk about what you are doing in production, the less free information you hand to attackers. This may sound harsh, but it is a basic countermeasure against social engineering. At several conferences I have seen people give away details about their image training sets, such as image sizing during training, augmentation strategies, pre-checks on the API side, and even the lack of any strategy for dealing with adversarial examples. This information alone can be very useful to attackers because it gives a clear picture of the model's limitations. If you want to talk about your solution, talk more about the reasons (why) and less about the implementation (how). Announcing on social media that I keep all my money under my pillow, that my door has only a single lock, and that I will not be home from work until after 8 PM does not make my house safer. Remember: less information = less exposure.