Trying Out SDLC and Testing for LLM-powered Apps

mark t
July 17, 2025

SDLC includes testing, though...

Transcript

  1. markt, Security Engineer at primeNumber Inc. (@NKMGR_OldSchool). "BurpAI, just make it work somehow!" (It didn't.)

    Favorite cloud service: … Favorite vulnerability: none. It is better to have none.
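  2. Company overview

    - Company name: primeNumber Inc. (株式会社primeNumber)
    - Representative: Yuki Tanabe (田邊 雄樹), Representative Director and CEO
    - Founded: November 2015
    - Members: 116
    - Total funding raised: approx. 3.4 billion yen
    - Office: JR Tokyu Meguro Building 5F, 3-1-1 Kamiosaki, Shinagawa-ku, Tokyo

    © primeNumber Inc.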
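  3. Services provided by primeNumber

    We provide multiple SaaS products to address the challenges of each data management phase. Our consulting services can also support every phase as a one-stop shop. © primeNumber Inc.

    Diagram labels (data phases): utilization, analysis, visualization, storage, integration, scattered data.
    - One-stop support for the steps from zero toward making data utilization a reality
    - Cloud ETL service
    - Integration service specialized for executing initiatives that use data
    - AI data platform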
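  4. Secure SDLC for LLM-powered apps: expectation vs. reality

    Expectation: enough knowledge has accumulated by now, so surely it can be used as-is?
    Reality:
    - 👍 OWASP Top 10 for LLM Apps => useful for thinking through test methods, but it does not contain prompts that can be used as-is.
    - Prompt injection and other cheat sheets in circulation => they put no weight on attacks against web apps that use an LLM.
    - Official Bedrock docs => they cover how to use Bedrock securely; Guardrails likewise aim to prevent undesirable conversations.
    => The SDLC has to be worked out by feeling our way forward…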
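  5. Which attacks are we defending against?

    Diagram labels: internal tool, internal tool, general Q. "We want to prevent abuse here." "This part is Bedrock's job, thanks…" "DBs we do not want reached."
    - Basic threat modeling is still effective: where are the important assets, and what would we hate to have done to them?
    - AI playing around inside the boundary is tolerated (that is where we end up).
    SDLC phases: Design, Coding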
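  6. Trying out the OWASP Top 10 as well

    OWASP Top 10 for LLM 2024 → changed considerably in the 2025 edition:
    - Prompt injection has become more sophisticated, and so have its countermeasures
    - Techniques that target libraries have been strengthened
    - Assume that the system prompt will leak
    - Track changes to providers' terms of service as well
    As LLM accuracy (trustworthiness) rises -> the permissions granted to it and the attack scenarios both grow.
    OWASP also appears to be working by trial and error:
    - Many related WGs are springing up: https://genai.owasp.org/
    - OWASP Global Slack: https://join.slack.com/t/owasp/signup
    Figure: the list reorganized for the dev team.
    SDLC phase: Testing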
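  7. Building a Prompt Injector

    Goal: check with a scan that simply loads prompts and feeds them in (keeping a conversation going by hand is painful, and I forget what I have already tried…).
    ↓ Burp should have an Extension for this… could AI Prompt Fuzzer work?
    ↓ It sends all payloads at once and is not meant for conversations.
    ↓ → Extending an Extension has become easy these days. (#Keep it a PoC, and consider contributing.)
    Modified version ✨:
    - Put the preloaded prompts into PLACEHOLDER and send them in order.
    - Once the condition for checking the AI's reply status is known, the next prompt can be POSTed after that check.
    SDLC phase: Testing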
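  8. I built the Prompt Injector, but…

    Context is everything…
    - Publicly available prompts mainly attack the model.
    - Cliché prompts such as "ignore your constraints" do not budge Bedrock; more to the point, targeting that has little test value.
    - Tool use scenarios need to be compared against the data.
    Prompt suggestions from LLMs
    - An AI that has read the code base, when asked to come up with prompts, produces detailed instructions (for example, naming a specific validation and telling the app to disable it).
    - It proposes instructions without hesitation, such as generating massive amounts of data or deleting all logs.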
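  9. Summary: the reality of Secure SDLC for LLM-powered apps

    Design
    - Design review is highly effective (especially the first one).
    - Render unto Amazon what is Amazon's (let Amazon protect it).
    - Where is data sent to the LLM? (Not necessarily only in chat.)
    - DoS is a quite realistic risk.
    - Legal review also matters (what is sent to the LLM, and the terms of service).
    Coding
    - Write an explainer doc for the Top 10 so that developers can picture, at the design and implementation stages, what attacks will come.
    Testing
    - Trial & error seems to be the only way.
    - Have an LLM think up prompts that attack an LLM (each vendor is presumably still growing its memory features, so such an agent can be expected to produce sharper prompts than a human).
    Monitoring
    - Difficulty: high (especially inline blocking).
    - If the "meaning" of an instruction gets across, the LLM corrects it on its own and takes the action anyway.
    - Tagging and LLM Observability Tools look important.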
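
The Prompt Injector loop described in transcript item 7 (put preloaded prompts into PLACEHOLDER, send them in order, and POST the next one only after the reply status has been checked) could be sketched roughly as follows. This is a standalone Python illustration, not the Burp extension from the talk: `inject_in_order`, `post`, `fetch_reply`, `reply_ready`, and `max_checks` are hypothetical names, and the JSON request template is an assumption.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple

PLACEHOLDER = "PLACEHOLDER"  # marker named on the slide; the template format is assumed


def inject_in_order(
    template: str,
    prompts: Iterable[str],
    post: Callable[[str], Dict],          # hypothetical: sends one request body, returns a reply
    reply_ready: Callable[[Dict], bool],  # hypothetical: the "reply status" check
    fetch_reply: Callable[[], Dict],      # hypothetical: re-reads the reply while it is pending
    max_checks: int = 5,
) -> List[Tuple[str, Dict]]:
    """Send prompts one at a time, waiting for each reply before the next POST."""
    results: List[Tuple[str, Dict]] = []
    for prompt in prompts:
        # Escape the prompt so it can sit inside a JSON string literal
        # (assumes PLACEHOLDER appears between double quotes in the template).
        escaped = json.dumps(prompt)[1:-1]
        reply = post(template.replace(PLACEHOLDER, escaped))

        checks = 0
        while not reply_ready(reply) and checks < max_checks:
            reply = fetch_reply()
            checks += 1

        if not reply_ready(reply):
            # The reply never completed: stop instead of POSTing the next
            # prompt into a conversation in an unknown state.
            break
        results.append((prompt, reply))
    return results
```

Passing `post` and `fetch_reply` in as callables keeps the sketch independent of any particular HTTP client or proxy tool. The returned list of (prompt, reply) pairs also records what was tried, which the slide notes is easy to forget when testing by hand.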