Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Skynet the CTI Intern: Building Effective Machine Augmented Intelligence

Skynet the CTI Intern: Building Effective Machine Augmented Intelligence

The presentation emphasizes integrating LLMs into CTI as enhancements for analysts, showcasing efficiency gains with real-world examples. It underscores LLM limitations, advocating a collaborative, symbiotic relationship between human analysts and LLMs for proactive cybersecurity defense.

Scott J. Roberts

May 09, 2024

More Decks by Scott J. Roberts

Other Decks in Technology


  1. Allow me to Reintroduce Myself • Scott J Roberts •

    Head of Threat Research @ Interpres Security • 20 years of experience including Mandiant, GitHub, Apple, & Splunk • MAI w/Data Analytics @ USU • Adjunct Instructor of Cyber Security @ USU
  2. What This Talk is Not! • No Foundational Models •

    No Releasing Software or Products • A collection of perfect prompts
  3. What This Talk Is! • Talking through pros and cons

    of trying to adopt new AI tools incrementally • Not specific to any particular LLM provider or model (though I used OpenAI & Copilot for most of them) • The Plan • Overall Problem • Problem Use Cases • Conclusion
  4. The Problem™ • Interpres needs to continuously collect, analyze, disseminate

    intelligence data about the entire global galaxy of espionage, attack, and criminal adversaries. • We extract and create relationships to both MITRE ATT&CK and custom Interpres intelligence STIX2 objects.
  5. This Is Chandler M • Former 😭 Intrepres Threat Engineering

    Intern • Utah State University Data Analytics & Information Systems Senior • Head of USU Student Organization for Cybersecurity
  6. Google Cloud - AI and the 5 Phases of the

    Threat Intelligence Lifecycle • Broke down application of LLMs at each stage of the intelligence cycle (with an obvious focus on cyber threats) • Very few details, but enough to come up with a few ideas
  7. Google Cloud - Supercharging security with generative AI • A

    useful breakdown of a major security centric org (Google) taking a multifaceted approach to using LLMs across a wide variety of tools • Less to go off of in the document, but links to a lot of very inspiring posts, such as the work being done by VirusTotal
  8. Thomas Roccia - Applying LLMs to Threat Intelligence • Probably

    one of the single best blog posts about not just ideas, but examples of implementing both basic and advanced LLM techniques to security • Dives Into • Prompt Engineering • Few Shot Prompting • Retrieval Augmented Generation • Tokenization & Embeddings
  9. Use Case #1 Summary Generation - Problem • Challenge: Summarizing

    longer articles for rapid evaluation • Original Solution: Natural Language Toolkit • Problem: Summaries were often very uneven and included content that made them incredibly difficult to read
  10. Use Case #1 Summary Generation - Solution • Experimentation: Programmatic

    summary generation with the OpenAI Conversational Endpoint • Outcome: Cut over fully to using OpenAI based summaries
  11. Use Case #2: One O, Data Generation - Problem •

    Challenge: As part of improving the service we often need one off data for enrichment • Original Solution: Manual Effort • Problem: Demonym to ISO3166-3 mapping
  12. Demonym Country Name ISO3166-3 Chinese The People's Republic of China

    CHN Iranian The Islamic Republic of Iran IRN North Korean Democratic Peoples Republic of Korea PRK Russian Russian Federation RUS
  13. Use Case #2: One Off Data Generation - Solution •

    Experimentation: Asking directly in the Chat interface for the data we want • Outcome: Mixed, eventually with good prompting accurate results that required manual manipulation
  14. Use Case #3: ATT&CK Technique Extraction - Problem • Challenge:

    Finding references to techniques in vendor blog posts • Original Solution: Regex Based Extraction • Problem: Not everyone gives nice and easy tables
  15. Found True Positive False Positive False Negative T1133 T1133 T1078.002

    T1078.002 T1059.003 T1059.003 T1078.003 T1078.003 T1078.002 T1078.002 T1543.003 T1543.003 T1562.001 T1562.001 T1560.001 T1560.001 T1219 T1219 T1105 T1105 T1537 T1537 T1486 T1486
  16. Found True Positive False Positive False Negative T1133 T1133 T1078.002

    T1078.002 T1059.003 T1059.003 T1078.003 T1078.003 T1078.002 T1078.002 T1543.003 T1543.003 T1562.001 T1562.001 T1560.001 T1560.001 T1219 T1219 T1105 T1105 T1537 T1537 T1486 T1486
  17. Use Case #3: ATT&CK Technique Extraction - Solution • Experimentation:

    • Direct Call to OpenAI Conversation API: Failure • Langchain Tool Mode with OpenAI: Failure • Prompting in the OpenAI Chat Interface: Success • Outcome: Needs more experimentation
  18. Use Case #4: STIX2 Object Merging - Problem Challenge: We

    often create ATT&CK objects for the “same” group and need to cluster them after the fact • Original Solution: Lots of manual efforts • Problem: Manual effort is time consuming and error prone
  19. Use Case #4 STIX2 Object Merging - Solution • Experimentation:

    Programmatic merging of STIX2 objects using the OpenAI • Outcome: Success! Shockingly effective in testing, still being investigated to operationalize
  20. Code Assistance with GitHub Copilot • Increased Productivity: Speeds up

    the coding process by providing intelligent code suggestions and completions. It can help “good” developers write code faster, reducing the time spent on repetitive tasks and boilerplate code. • Code Quality Improvement: Copilot can assist developers in writing code that follows best practices and coding conventions. • Easy Access to a LLM in Code, even for Data: Given the access to base LLMs it allows easy generate of data, especially when you’re creating data in code.
  21. In The End • Case #1 – Summary Generation •

    Successful • Operational • Case #2: One Off Data Generation • Successful… Mostly • Used as Necessary • Case #3: ATT&CH Technique Extraction • Mixed Success • More Research Needed • Case #4: STIX2 Object Merging • Successful (Unexpectedly) • More Research Needed
  22. What about Chandler M? • Still around Utah State •

    Now with another internship this summer • Not replaced by AI no matter how hard I tried • Proved the value of not just interns but early career folks in CTI
  23. Conclusions • While LLMs cannot do everything they can often

    be effective at specific tasks • Speculation is usually somewhat useless but code wins • If at first you don’t succeed try and try again • Different Services • Different Models • Different Hyperparameters • We’re going to continue integrating and start looking at fine tuning our own models