Designing and building machine learning systems requires a lot of skill, time, and experience. Data scientists, developers, and ML engineers work together to build ML systems and pipelines that automate different stages of the machine learning process. Once these systems have been set up, they need to be properly secured to prevent them from being hacked and compromised.
ML systems are generally built using Python, and some attacks have been customized to take advantage of vulnerabilities in certain Python libraries such as Joblib, urllib, and PyYAML. Other attacks may exploit vulnerabilities in the custom code written by ML engineers. In addition, we'll look at attack vectors that affect cloud SDKs available in Python (e.g., the SageMaker Python SDK). There are many ways to attack machine learning systems, and most data science teams are not equipped with the skills required to secure the systems they build. In this talk, we will discuss the cybersecurity attack chain in detail and how it affects a company's strategy when setting up different layers of security. We will discuss the different ways ML systems can be attacked and compromised, and along the way we will share relevant strategies to mitigate these attacks. This includes attacks on deployed custom APIs (ML inference endpoints) built with popular Python frameworks (e.g., Flask, Pyramid, Django), as well as serverless applications and architectures written in Python (e.g., Chalice).
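To make the library-level attack surface concrete, the minimal sketch below illustrates the insecure deserialization pattern that attacks involving PyYAML and Joblib artifacts typically rely on. The YAML payload and file name are hypothetical placeholders, and the unsafe calls are commented out; this is an illustration of the general pattern, not a description of any specific CVE.

```python
# Minimal sketch: insecure deserialization as an ML pipeline attack vector.
# Assumptions: PyYAML is installed; "model.joblib" is a hypothetical artifact
# received from an untrusted source.
import yaml

untrusted_yaml = """
!!python/object/apply:os.system
- echo "arbitrary command execution"
"""

# UNSAFE: yaml.load with an unsafe loader constructs arbitrary Python objects,
# so the payload above would run a shell command when parsed.
# yaml.load(untrusted_yaml, Loader=yaml.UnsafeLoader)

# SAFER: safe_load only builds plain types (dicts, lists, strings, numbers)
# and rejects python-specific tags like the one above.
try:
    yaml.safe_load(untrusted_yaml)
except yaml.YAMLError as exc:
    print(f"Rejected untrusted YAML: {exc}")

# The same caution applies to pickled model artifacts: joblib.load (and
# pickle.load) can execute code embedded in the file, so only load artifacts
# whose origin and integrity you can verify first.
# model = joblib.load("model.joblib")  # only after verifying provenance
```

The general mitigation is to treat model artifacts and configuration files that cross a trust boundary like any other untrusted input: prefer safe parsers, and verify provenance (e.g., checksums or signatures) before deserializing.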
Finally, we will show how to review and assess newly discovered vulnerabilities in Python libraries and packages. We will share tips and techniques for checking whether any of your ML systems and environments are vulnerable to certain types of attacks, and we'll illustrate these with examples using ML frameworks such as PyTorch and TensorFlow.
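As a rough idea of what such an environment check can look like, the sketch below compares installed package versions against a small advisory list. The minimum versions shown are placeholders for illustration only, not real advisory data; in practice, a tool such as pip-audit that queries a vulnerability database is a better starting point.

```python
# Minimal sketch of an environment check: compare installed package versions
# against a small "fixed in" list. Versions below are hypothetical placeholders.
from importlib.metadata import PackageNotFoundError, version

from packaging.version import Version  # provided by the "packaging" package

# Hypothetical minimum safe versions; replace with data from a real advisory source.
MINIMUM_SAFE_VERSIONS = {
    "PyYAML": "6.0",
    "torch": "2.0.0",
    "tensorflow": "2.12.0",
}


def check_environment(requirements: dict) -> None:
    """Print which installed packages fall below the given minimum versions."""
    for package, minimum in requirements.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            print(f"{package}: not installed, skipping")
            continue
        if Version(installed) < Version(minimum):
            print(f"{package} {installed} is below {minimum}; consider upgrading")
        else:
            print(f"{package} {installed} meets the minimum in this list")


if __name__ == "__main__":
    check_environment(MINIMUM_SAFE_VERSIONS)
```

A check like this only tells you whether a vulnerable version is installed; assessing whether your PyTorch or TensorFlow workloads actually exercise the affected code path is the follow-up step we will walk through in the talk.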