Upgrade to Pro — share decks privately, control downloads, hide ads and more …

INTERFACE by apidays 2023 - Securing LLM and NL...

INTERFACE by apidays 2023 - Securing LLM and NLP APIs: A Journey to Avoiding Data breaches, Attacks and More, Ads Dawson & Jared Krause, Cohere

INTERFACE by apidays 2023
APIs for a “Smart” economy. Embedding AI to deliver Smart APIs and turn into an exponential organization
June 28 & 29, 2023

Securing LLM and NLP APIs: A Journey to Avoiding Data Breaches, Attacks, and More
Ads Dawson, Senior Security Engineer at Cohere
Jared Krause, Senior Full Stack Software Developer at Cohere

------

Check out our conferences at https://www.apidays.global/

Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8

Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io

Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/

apidays

July 11, 2023
Tweet

More Decks by apidays

Other Decks in Programming

Transcript

  1. Language AI Security at the API level: Avoiding Hacks, Injections

    and Breaches Ads Dawson github.com/GangGreenTemperTatum linkedin.com/in/adamdawson0 Jared Krause github.com/kravse kravse.dev
  2. Language AI Security at the API level: Avoiding Hacks, Injections

    and Breaches Ads Dawson github.com/GangGreenTemperTatum linkedin.com/in/adamdawson0 Jared Krause github.com/kravse kravse.dev Or… Research Notes Detailing the Hacks and Attacks in the AI Wild West.
  3. Cohere, one of the leading pioneers in generative AI, runs

    a platform based on state-of-the-art AI models enabling it to provide developers with a range of tools to create customized NLP solutions. Ads Dawson Jared Krause
  4. Lets Cover Some Basics LLM ➠ Large Language Model. These

    are neural networks trained on large collections of data that we use to process and analyze text. NLP ➠ Natural Language Processing The branch of computer science focused on teaching computers to speak.
  5. 9 The Institute for Ethical AI & Machine Learning #

    OWASP Vulnerability MLSecOps Equivalent 1 Broken Access Control Unrestricted Model Endpoints 2 Cryptographic Failures Access to Model Artifacts 3 Injection Artifact Exploit Injection 4 Insecure Design Insecure ML Systems/Pipeline Design 5 Security Misconfigurations Data & ML Infrastructure Misconfigurations 6 Vulnerable & Outdated Components Supply Chain Vulnerabilities in ML Code 7 Identification & Auth Failures IAM & RBAC Failures for ML Services 8 Software and Data Integrity Failures ML Infra / ETL / CI / CD Integrity Failures 9 Logging and Monitoring Failures Observability, Reproducibility & Lineage 10 Server-side Request Forgery ML-Server Side Request Forgery Resources sourced from https://ethical.institute/security.html
  6. Lets focus on a few important LLM API Vulnerabilities 1.

    Prompt Hacking (aka Jailbreaks, adversarial prompting) ➔ What the AI is a Jailbreak? 2. Prompt Injection ➔ CPRF (Cross-Plugin Request Forgery) ➔ Package hallucinations ➔ XSS - Data Exfiltration 3. Training Data Poisoning
  7. Disclaimer! • The information contained in these slides are not

    a direct vulnerability of a specific company, organization or service • It is advised not to perform these types of vulnerability detections without explicit consent
  8. 13 Prompt Injection -> GitHub PWN - “The Confused Deputy”

    Resources sourced from https://embracethered.com/blog/posts/2023/chatgpt-plugin-vulns-chat-with-code/
  9. Prompt Injection in the Wild - LLM hallucination Resources sourced

    from https://vulcan.io/blog Popular techniques for spreading malicious packages 1. Typosquatting 2. Masquerading 3. Dependency Confusion 4. Software Package Hijacking 5. Trojan Package
  10. Prompt Injection -> Data Exfiltration (XSS - Cross-Site Scripting) Resources

    sourced from https://embracethered.com/blog/posts/2023/bing-chat-data-exfiltration-poc-and-fix/ & https://wuzzi.net/posts/data-exfiltration/
  11. ➔ Continually evaluating our models against prompt injection techniques ➔

    Benchmarking this analysis How Can We Mitigate This Risk? Our Journey and Lessons Learnt ➔ Integrating LLM vulnerability scanning into our rigorous testing QA ➔ Regularly testing our LLM’s with prompt injection to verify results
  12. Analysing and Benchmarking Reports for Further Analysis FMEA - Failure

    Mode and Effect Analysis Intentional failures where the failure is caused by an active adversary attempting to subvert the system to attain their goals — either to misclassify the result, infer private training data, or to steal the underlying algorithm. Unintentional failures where the failure is because an ML system produces a formally correct but completely unsafe outcome.
  13. Current frameworks within an early lifecycle of a technology which

    is rapidly developing at scale 🚀 ➔ OWASP Top 10 for Large Language Model Applications ➔ Google’s Secure AI Framework ➔ NIST 1.0 (“AI RMF”) ➔ CSA Security Implications of ChatGPT ➔ Ethical ML Institute ➔ PROPOSED: Fundamental Limitations of Alignment in Large Language Models (Behavior Expectation Bounds (BEB))