Upgrade to Pro — share decks privately, control downloads, hide ads and more …

​Securing the Mind of Machines : GenAI Security...

​Securing the Mind of Machines : GenAI Security & Trust Frameworks

Title: ​Securing the Mind of Machines : GenAI Security & Trust Frameworks
Presenter: Harsh Tandel
Event: BreachForce CyberSecurity Cohort
Talk Date: 21st-December-2024

Key Takeaways: ​AI models keep updating but how do you test, trust, and secure something that keeps changing?

Avatar for BreachForce

BreachForce

July 15, 2025
Tweet

More Decks by BreachForce

Other Decks in Technology

Transcript

  1. Who Am I • CVE 2022-3343 • P1 Warrior Bugcrowd

    • Bugbase Global Top 200 • Integrity Global Top 1000 • Speaker,Trainer,Blog Writer • Security Consultant & Researcher • Awarded by MOD UK, Dutch Tax Admin, Indian police
  2. Neural Network:A neural network is a deep learning technique designed

    to resemble the structure of the human brain. It requires large data sets to perform calculations and create outputs, which enables features like speech and vision recognition. Natural Language Processing (NLP):A field of AI that enables computers to understand and generate human language. Machine Learning (ML):A subset of AI that focuses on algorithms that can learn from data without explicit programming. Federated learning is a machine learning technique that allows multiple entities to collaboratively train a model without sharing their raw data. Deep learning : Deep learning is a subset of machine learning that focuses on utilizing multilayered neural networks to perform tasks such as classification, regression, and representation learning. Large Language Model (LLM):A type of AI model that is trained on large amounts of text data to generate human- like text. Retrieval-Augmented Generation (RAG) :It is a technique that enhances the accuracy and relevance of large language model (LLM) responses by integrating them with external knowledge sources Conventional AI: AI also known as narrow or weak AI, is designed for specialized tasks.Conventional AI relies heavily on data-driven processes, leveraging algorithms and ML techniques to perform tasks. Basic Terminology
  3. Why Gen AI Security ? • Everyone is using Gen

    AI to create arts,make decision,enhance their project.Every firm is rushing to integrate AI in their products and systems.It is making it crucial to consider it’s security.AI Models and MCP Servers expand the new attack surface,vulnerabilities beyond traditional codes. • The generative AI market is experiencing rapid growth, with a projected market size of $66.89 billion in 2025 and a forecasted compound annual growth rate (CAGR) of 36.99% between 2025 and 2031, leading to a market volume of $442.07 billion by 2031. • A study by Menlo Security showed that 55% of inputs to generative AI tools contain sensitive or personally identifiable information (PII), and found a recent increase of 80% in uploads of files to generative AI tools, which raises the risk of private data exposure. • Gartner Press Release, “Gartner Predicts 40% of AI Data Breaches Will Arise from Cross-Border GenAI Misuse by 2027,” February 17, 2025. • Generative AI (GenAI) red teaming is crucial for identifying and mitigating vulnerabilities in AI systems before they can be exploited by malicious actors. By simulating attacks and
  4. • Prompt Injection: Tricking the model into breaking its rules

    or leaking sensitive information. • Bias and Toxicity: Generating harmful,offensive,or unfair outputs. • Data Leakage: Extracting private information or intellectual property from the model. • Data Poisoning: Manipulating the training data that a model learns from to cause it to behave in undesirable ways. • Hallucinations: The model confidently provides false information. • Agentic Vulnerabilities: Complex attacks on AI “agents” that combine multiple tools and decision making steps. • Supply Chain Risks: Risks that stem from the complex, interconnected processes and interdependencies that contribute to the creation, maintenance, and use of models. • Jailbreaking: Jailbreaking is the process of utilizing specific prompt structures,input patterns,or contextual cues to bypass the built-in restrictions or safety measures of LLMs. • Demo Time : Immersive GPT Red Teaming Gen AI
  5. Frameworks • DAN (Do anything) Jailbreak prompt • DeepTeam (The

    LLM Red Teaming open-source Framework) • Harm Bench (A Standardized Evaluation Framework for Automated Red Teaming) • Play Ground • Responsible AI (RAI):focuses on developing and deploying AI systems that are ethical, transparent, and aligned with human values, prioritizing fairness, accountability, and respect for privacy. • Google's Secure AI Framework (SAIF) : Google's SAIF is a conceptual framework designed to help organizations build and deploy secure AI systems • NIST AI RISK MANAGEMENT FRAMEWORK(RMF) : The NIST AI Risk Management Framework (AI RMF) is a guide designed to help organizations manage AI risks at every stage of the AI lifecycle—from development to deployment and even decommissioning.
  6. Standards and Acts For AI Standards • ISO/IEC 42001:2023: AI

    security and management which provides a framework for organizations to manage AI responsibly and ethically. • ISO/IEC TR 27563:2023 is a technical report that provides best practices for assessing security and privacy in artificial intelligence (AI) use cases. • ISO/IEC DIS 27090 Guidance for addressing security threats to artificial intelligence systems. Global AI Act • The European Union's AI Act • Artificial Intelligence & Data Act (AIDA),Canada • Singapore's Model AI Governance Framework
  7. Content Filters : AI content filters are systems designed to

    detect and prevent harmful or inappropriate content. They work by evaluating input prompts and output completions, using neural classification models to identify specific categories such as hate speech, sexual content, violence, and self-harm e.g., in Azure AI Foundry, Vertex AI Meta prompt : A meta prompt, or system message, is a set of natural language instructions used to guide an AI system's behavior (do this, not that). A good meta prompt would say "if a user requests large quantities of content, only return a summary of those search results. Guardrails:Guardrails are mechanisms and frameworks designed to ensure that AI systems operate within ethical,legal, and technical boundaries.They prevent AI from causing harm, making biased decisions, or being misused. LLM Guard: The Digital Firewall for Language Models, By offering sanitization, detection of harmful language, prevention of data leakage, and resistance against prompt injection attacks. Data security posture management (DSPM): DSPM identifies sensitive data across cloud and services,it continuously monitors data security, identifies risks, assesses vulnerabilities and provides remediation strategies. MCP Scan : Security scanner for Model Context Protocol (MCP) servers. Scan for common vulnerabilities AI Security Solution
  8. .