Unfurling AI

UNFURLING AI Pralhad Chaskar Security Consultant

WHY SHOULD I UNDERSTAND OR LEARN ABOUT AI ?

WE GOT GARTNER QUADRANT

AI, ML, DL & GENAI – HOW IT ALL FITS
TOGETHER

AI LINGO ▪ Large Language Model (LLM) = general-purpose reasoning
+ knowledge + text generation ▪ Examples: GPT-4o, Claude 3, LLaMA-3-70B ▪ Pros: ▪ Strong reasoning ▪ Best accuracy ▪ Handles long context ▪ Great for complex tasks ▪ Cons: ▪ Heavy (GPU cost) ▪ Slower than SLM ▪ Can hallucinate without external data ▪ Use When: ▪ You need intelligence, reasoning, creativity ▪ Cybersecurity analytics, report generation, AI assistants ▪ Small Language Model (SLM) = tiny, fast model optimized for low compute ▪ Examples: LLaMA-3-8B, Phi-3-mini, Gemma-2-9B ▪ Pros: ▪ Cheap (runs on CPU) ▪ Low latency ▪ Good for edge/on-device ▪ Cons: ▪ Limited reasoning ▪ Struggles with complex tasks ▪ Needs domain fine-tuning ▪ Use When: ▪ Mobile, IoT, lightweight apps ▪ Classification, summarization ▪ Secure environments (on-prem)

AI LINGO ▪ RAG = model + your data Model
retrieves relevant documents → uses them to respond. ▪ Pros: ▪ Reduces hallucination ▪ Uses fresh internal data ▪ Explainable (citations) ▪ Cons: ▪ Needs data pipelines, embeddings, vector DB ▪ Retrieval quality matters ▪ Use When: ▪ You need correctness based on enterprise data ▪ Policies, security docs, patient records, tickets

AI LINGO ▪ AI Agent = LLM (or SLM) +
tools + memory + actions ▪ Examples: browser-tool, code-executor, database-query agent ▪ Pros: ▪ Executes tasks, not just answers ▪ Can call APIs and take steps ▪ Cons: ▪ Needs constraints → risky if unrestricted ▪ Harder to evaluate and test ▪ Use When: ▪ You want automation: run scripts, query systems, orchestrate actions ▪ Pentest automation, cloud checks, ticket triage ▪ Agentic AI = multiple AI agents collaborating with autonomy, planning, and tool-use to complete complex goals ▪ Examples: Planner–Worker–Verifier system, multi-agent security workflow, autonomous research agents ▪ Pros: ▪ Handles long-horizon, multi-step tasks ▪ Can coordinate multiple agents with roles ▪ Performs planning, execution, verification loops ▪ Great for automation across systems ▪ Cons: ▪ Higher risk if agents act without constraints ▪ Harder to test, validate, and predict behaviors ▪ Requires strict governance, permissions, and logging ▪ Complex to monitor and secure in enterprise setups ▪ Use When: ▪ Automating end-to-end workflows (triage → exploit test → report → ticket) ▪ Need planning + memory + multi-tool actions ▪ Building multi-agent systems for analysis, pentesting, cloud checks ▪ Tasks requiring collaboration between specialized agents

AI LINGO ▪ MCP = a standardized way for AI
models to access tools, data, and external systems safely ▪ Examples: file-system tool, database resource, RAG retriever, internal API connector ▪ Pros: ▪ Universal protocol → works across LLMs/SLMs ▪ Strong control: scoped permissions, auditable tool calls ▪ Reusable tools → one MCP tool works for all agents ▪ Reduces integration complexity ▪ Cons: ▪ Requires secure implementation (authN, authZ, rate limits) ▪ Over-permissive tools can create security gaps ▪ Adds architectural overhead ▪ Use When: ▪ You want safe & standardized tool usage for LLM/SLM agents ▪ Integrating enterprise systems (APIs, DBs, documents) ▪ Building modular agentic architectures ▪ Need auditability + governance for AI actions

TOKENS ▪ A token is the basic unit of text
that a Large Language Model (LLM) processes, acting as the "currency" of the model. ▪ Tokens can be whole words, parts of words, individual characters, or punctuation marks. ▪ LLMs convert text into these tokens to understand and generate language more efficiently, with the specific way text is broken down depending on the model's tokenization method

PROMPT FORMAT https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools

SIGNIFICANCE OF B ▪ 2B = 2 Billion Parameters ▪
A parameter = a learned weight in the neural network that shapes how the model predicts text. ▪ Why It Matters: ▪ More parameters → more learning capacity ▪ Better reasoning & language quality ▪ Higher compute + memory cost ▪ Slower and more expensive than SLMs, faster than very large LLMs ▪ Model Size Scale: ▪ Small: 1B–3B → your 2B model fits here ▪ Medium: 7B–13B ▪ Large LLM: 30B–70B ▪ Frontier-scale: 100B–1T+ ▪ Use When: ▪ You want good quality with lower cost ▪ Edge deployments or enterprise workloads ▪ Tasks where full LLM power is not required

WHY COSTING MATTERS

INFERENCE ENGINE

AI IN PRODUCTION

SKILL SET NEEDED FOR ASSESSMENT ▪ Prompt Injection ▪ Web
Hacking ▪ API Hacking ▪ Threat Modeling a Systems Architecture (good to have)

LLM ASSESSMENT METHODOLOGY ▪ Identify System Inputs - map every
entry point (APIs, prompts, files). ▪ Attacking Ecosystem - scan supply chains, dependencies, and integrations. ▪ Attacking The Model - test for jailbreaks, data extraction, and hallucination exploits. ▪ Attacking the Prompt Engineering - attempt prompt injections and RAG poisoning. ▪ Attacking the Data - assess risks from training data, embeddings, and poisoning. ▪ Attacking the Application - look for denial-of-wallet, API abuse, and unsafe downstream use. ▪ Pivoting - see if attackers can move laterally through plugins, DBs, or APIs.

PROMPT INJECTION TYPES ▪ Framing & Persona Evasion ▪ Concept:
Trick model into adopting a persona or scenario where rules appear “valid.” ▪ Techniques from 0DIN: ▪ First Person Perspective ▪ Investigative Journalist Persona ▪ Speculative Knowledge Preservation ▪ Story Teller Tactic ▪ Servile Scientist Persona ▪ Apocalyptic Preservation Scenario ▪ Academic Framing (Professor, Researcher) ▪ Technical Wiki Author Persona ▪ Why it works: Persona shifts disable intent classification. ▪ Formatting & Encoding Bypass Techniques ▪ Concept: Break token-based or keyword-based safety filters. ▪ Techniques from 0DIN: ▪ Token Disruption via Random Spacing (“chem- ic- al”) ▪ Zero-Width Unicode Injection ▪ Abbreviation Expansion (C R Y S T A L → CRYSTAL) ▪ ASCII Decimal Encoding (49 50 51 → “123”) ▪ IPA / Phonetic Encoding ▪ Morse Code Injection (decode then answer) ▪ Key–Value Pair Formatting (Step=1; Temp=140C) ▪ Misspelling/Dialogue-Based Token Masking ▪ Scientific Formula Notation Evasion ▪ Why it works: Guardrails rarely normalize/cleanup input before filtering.

PROMPT INJECTION TYPES ▪ Simulation & Meta Attacks ▪ Concept:
Force model into a simulated environment where it produces disallowed content as “fake output.” ▪ Techniques from 0DIN: ▪ Terminal Simulation (“cat /etc/passwd” → model prints sensitive data) ▪ Debug Framework Simulation (multi-layer steps, debug mode) ▪ Memory Dump Simulation ▪ Multi-Agent Lambda-Pattern Simulation ▪ Fictional API Detection Tactic ▪ Why it works: The model thinks it is generating simulated or debug output, bypassing restrictions. ▪ Narrative / Academic Wrapper Attacks ▪ Concept: Hide harmful intent inside educational or harmless-seeming tasks. ▪ Techniques from 0DIN: ▪ Compare & Contrast Academic Analysis ▪ Essay Title Completion ▪ Academic Chemistry Paper ▪ Scientific Wrapper (“for regulatory compliance”) ▪ Technical Wiki Entry ▪ Story Based Knowledge Extraction ▪ Why it works: Guard models interpret academic/scientific framing as benign.

PROMPT INJECTION TYPES ▪ Vision Model Jailbreaks ▪ Concept: Use
“contextual complexity” to bypass safety classification in image- generation models. ▪ Techniques from 0DIN: ▪ Classical Art Reframing ▪ Historical Tribal / Anthropological Depiction ▪ Feminist Body-Autonomy Art ▪ Vintage Polaroid Reenactment ▪ Artistic Escalation Technique ▪ Surprise Attack (indigenous documentary framing) ▪ Why it works: Safety filters are keyword- driven; artistic framing overrides them. https://github.com/elder-plinius/L1B3RT4S

OPENSOURCE TOOLS

PAID SERVICE

GUARDRAILS

ATLAS MATRIX https://atlas.mitre.org/matrices/ATLAS

OWASP CONTRIBUTION https://genai.owasp.org/

TOP 10 FOR GEN AI https://genai.owasp.org/llm-top-10/

HUGGING FACE AKA DOCKER HUB OF AIML ECOSYSTEM

AI INCIDENT DATABASE https://incidentdatabase.ai/

PROTECT AGAINST JAILBREAKS AND PROMPT INJECTIONS

Unfurling AI

Unfurling AI

Pralhad Chaskar

More Decks by Pralhad Chaskar

Other Decks in Technology

Featured

Transcript

UNFURLING AI Pralhad Chaskar Security Consultant

WHY SHOULD I UNDERSTAND OR LEARN ABOUT AI ?

WE GOT GARTNER QUADRANT

AI, ML, DL & GENAI – HOW IT ALL FITS

AI LINGO ▪ Large Language Model (LLM) = general-purpose reasoning

AI LINGO ▪ RAG = model + your data Model

AI LINGO ▪ AI Agent = LLM (or SLM) +

AI LINGO ▪ MCP = a standardized way for AI

TOKENS ▪ A token is the basic unit of text

PROMPT FORMAT https://github.com/x1xhlol/system-prompts-and-models-of-ai-tools

SIGNIFICANCE OF B ▪ 2B = 2 Billion Parameters ▪

WHY COSTING MATTERS

INFERENCE ENGINE

AI IN PRODUCTION

AI IN PRODUCTION

SKILL SET NEEDED FOR ASSESSMENT ▪ Prompt Injection ▪ Web

LLM ASSESSMENT METHODOLOGY ▪ Identify System Inputs - map every

PROMPT INJECTION TYPES ▪ Framing & Persona Evasion ▪ Concept:

PROMPT INJECTION TYPES ▪ Simulation & Meta Attacks ▪ Concept:

PROMPT INJECTION TYPES ▪ Vision Model Jailbreaks ▪ Concept: Use

OPENSOURCE TOOLS

PAID SERVICE

GUARDRAILS

ATLAS MATRIX https://atlas.mitre.org/matrices/ATLAS

OWASP CONTRIBUTION https://genai.owasp.org/

TOP 10 FOR GEN AI https://genai.owasp.org/llm-top-10/

HUGGING FACE AKA DOCKER HUB OF AIML ECOSYSTEM

AI INCIDENT DATABASE https://incidentdatabase.ai/

PROTECT AGAINST JAILBREAKS AND PROMPT INJECTIONS