BLADE: An Attempt to Automate Penetration Testing Using Autonomous AI Agents

Slide 1

Slide 1 text

BLADE: An Attempt to Automate Penetration Testing Using Autonomous AI Agents
AVTOKYO 2024
Isao Takaesu, Daiki Ichinose
no drink, no hack.

Slide 2

Slide 2 text

About us
Isao Takaesu (@bbr_bbq): A senior engineer at MBSD. His main work is developing security products and R&D related to AI security. He has spoken at conferences such as DEF CON Demo Labs and Black Hat Arsenal.
Daiki Ichinose (@mahoyaya): An engineer and pentester at MBSD with over 15 years of work experience. He draws on that know-how to give talks at conferences such as BSides Tokyo (2018, 2019), JAWS Days 2019, and many others.

Slide 3

Slide 3 text

Autonomous AI Agents?
An AI Agent is a framework based on LLMs that autonomously achieves goals set by humans. Based on the specified goals, it selects appropriate actions, divides tasks, gathers necessary information, and proceeds with execution.

Slide 4

Slide 4 text

AI Agents use case
This year is being called the “first year of AI Agents”, and many services that utilize AI Agents have been released.
● Microsoft, New autonomous agents scale your team like never before
  https://blogs.microsoft.com/blog/2024/10/21/new-autonomous-agents-scale-your-team-like-never-before/
● Anthropic Wants Its AI Agent to Control Your Computer
  https://www.wired.com/story/anthropic-ai-agent/
● Google is reportedly developing a ‘computer-using agent’ AI system
  https://www.theverge.com/2024/10/26/24280431/google-project-jarvis-ai-system-computer-using-agent

Slide 5

Slide 5 text

How does the AI Agent achieve its goals?
The AI Agent achieves its goal by combining the following three actions.
● Create tasks to achieve the goal
○ The AI Agent receives a goal from a human and creates tasks to achieve that goal.
○ It breaks each task down into executable subtasks.
● Execute subtasks
○ The AI Agent executes a subtask and, when it is complete, executes the next subtask.
○ It analyzes the result of each subtask and, if an error occurs, changes the procedure and executes it again.
● Gather information
○ It collects data from the environment while executing a subtask and executes the appropriate procedure.
○ It receives information from other AI Agents while executing a subtask.
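The three actions can be sketched as a small plan, execute, and retry loop. This is a minimal illustration, not BLADE's implementation: the names `Subtask`, `AgentRun`, `run_agent`, and `demo_plan` are hypothetical, and the plan is hard-coded here where a real agent would ask an LLM to produce it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    name: str
    # Candidate procedures, tried in order; each returns a result or raises.
    procedures: List[Callable[[Dict[str, str]], str]]


@dataclass
class AgentRun:
    goal: str
    context: Dict[str, str] = field(default_factory=dict)  # gathered information
    log: List[str] = field(default_factory=list)


def run_agent(goal: str, plan: Callable[[str], List[Subtask]]) -> AgentRun:
    run = AgentRun(goal=goal)
    for subtask in plan(goal):  # create tasks from the goal
        for attempt, procedure in enumerate(subtask.procedures, start=1):
            try:
                result = procedure(run.context)  # execute the subtask
            except Exception as exc:
                # On error, record it and switch to the next procedure.
                run.log.append(f"{subtask.name}: attempt {attempt} failed ({exc})")
                continue
            run.context[subtask.name] = result  # keep what was learned
            run.log.append(f"{subtask.name}: ok")
            break
        else:
            run.log.append(f"{subtask.name}: gave up")
    return run


def demo_plan(goal: str) -> List[Subtask]:
    # Hypothetical stand-in for an LLM-generated plan.
    def first_try(ctx: Dict[str, str]) -> str:
        raise RuntimeError("tool not found")

    def fallback(ctx: Dict[str, str]) -> str:
        return "Ubuntu"

    def summarize(ctx: Dict[str, str]) -> str:
        return f"OS is {ctx['identify_os']}"

    return [
        Subtask("identify_os", [first_try, fallback]),
        Subtask("summarize", [summarize]),
    ]


if __name__ == "__main__":
    for line in run_agent("describe the host", demo_plan).log:
        print(line)
```

Keeping the gathered results in a shared `context` dictionary is one simple way to let a later subtask use what an earlier one found.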

Slide 6

Slide 6 text

AI Agent execution pattern
[Figure: three agent conversation patterns]
● Two-Agent Chat: An Initializer turns the User Prompt and System Prompt into an Initial Message for Agent A and Agent B. The two agents chat, and a Summarizer turns the Chat History into the Chat Result.
● Sequential Chat: A series of chats, each with its own User Prompt and System Prompt. The result of each chat is passed to the next chat as Carryover.
● Group Chat: A Group Chat Manager coordinates Agents A, B, C, and D in three steps: (1) Select Speaker, (2) Agent Speak, (3) Broadcast Message.
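The group chat pattern (select speaker, agent speaks, broadcast) can be sketched in plain Python as follows. This is an illustration under assumptions, not the framework used in the talk: `group_chat`, `round_robin`, and `make_counting_agent` are hypothetical names, and the agents are simple functions where a real system would call an LLM.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]                # {"sender": ..., "content": ...}
Agent = Callable[[List[Message]], str]  # reads its inbox, returns a reply


def group_chat(
    agents: Dict[str, Agent],
    select_speaker: Callable[[List[Message], List[str]], str],
    opening: str,
    max_rounds: int = 4,
) -> List[Message]:
    history: List[Message] = [{"sender": "user", "content": opening}]
    # Every agent starts with its own copy of the opening message.
    inboxes = {name: list(history) for name in agents}
    for _ in range(max_rounds):
        speaker = select_speaker(history, list(agents))  # (1) select speaker
        reply = agents[speaker](inboxes[speaker])        # (2) agent speaks
        message = {"sender": speaker, "content": reply}
        history.append(message)
        for inbox in inboxes.values():                   # (3) broadcast message
            inbox.append(message)
    return history


def round_robin(history: List[Message], names: List[str]) -> str:
    # Hypothetical selection policy: take turns in a fixed order.
    return names[(len(history) - 1) % len(names)]


def make_counting_agent(name: str) -> Agent:
    # Stand-in for an LLM-backed agent: reports how much it has seen.
    return lambda inbox: f"{name} saw {len(inbox)} message(s)"


if __name__ == "__main__":
    agents = {"A": make_counting_agent("A"), "B": make_counting_agent("B")}
    for msg in group_chat(agents, round_robin, "hello", max_rounds=3):
        print(msg["sender"], "->", msg["content"])
```

The speaker-selection policy is passed in as a function so that a round-robin rule can be swapped for an LLM-driven choice without touching the loop.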

Slide 7

Slide 7 text

BLADE?
BLADE (Breaking Limits, Automate Deep Exploitation) is a penetration testing tool that uses AI Agents. It is designed to autonomously achieve penetration testing goals set by humans.
What can BLADE do?
● Create tasks to achieve the goal
  BLADE receives a goal from a human and creates tasks to achieve that penetration testing goal.
● Create Python code, commands, and shell scripts by itself
  BLADE can generate and execute Python code, commands, and shell scripts by itself in order to complete tasks.
● Use external tools
  BLADE can use external tools and API calls in order to complete tasks.

Slide 8

Slide 8 text

The AI Agents that build up BLADE
BLADE is built with seven AI Agents.
● Pen-Tester Agent: The agent in charge of the pentest. It gives instructions to each agent.
● LinPEAS Agent: Gathers vulnerability information. It executes LinPEAS on the target system.
● Judge PrivEsc Agent: Analyzes the LinPEAS results. It judges whether or not to attempt privilege escalation.
● Finding Creds Agent: Gathers information for lateral movement. It finds passwords, SSH keys, etc.
● N/W Scan Agent: Finds other hosts. It scans the internal network.
● Lateral Movement Agent: Moves laterally to other hosts. It repeats log-in attempts.
● Reporting Agent: Creates the pentest report. It writes a report on the vulnerabilities detected.
** BLADE allows you to change the type of LLM for each AI Agent.
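One way to picture "a different LLM per agent" is a small registry with per-agent overrides. The sketch below is an assumption about how such a configuration could look, not BLADE's actual format; `AgentSpec`, `resolve_models`, and the model names `default-llm` and `small-llm` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

DEFAULT_MODEL = "default-llm"  # hypothetical placeholder, not a real model name


@dataclass(frozen=True)
class AgentSpec:
    name: str
    role: str
    model: Optional[str] = None  # None means "use the default model"


# The seven agents listed on the slide.
AGENTS: List[AgentSpec] = [
    AgentSpec("Pen-Tester Agent", "Gives instructions to each agent"),
    AgentSpec("LinPEAS Agent", "Executes LinPEAS on the target system"),
    AgentSpec("Judge PrivEsc Agent", "Analyzes the LinPEAS results"),
    AgentSpec("Finding Creds Agent", "Finds passwords, SSH keys, etc."),
    AgentSpec("N/W Scan Agent", "Scans the internal network"),
    AgentSpec("Lateral Movement Agent", "Repeats log-in attempts"),
    AgentSpec("Reporting Agent", "Writes the pentest report"),
]


def resolve_models(agents: List[AgentSpec], overrides: Dict[str, str]) -> Dict[str, str]:
    """Map each agent name to the model it should use."""
    unknown = set(overrides) - {agent.name for agent in agents}
    if unknown:
        raise KeyError(f"unknown agent(s): {sorted(unknown)}")
    return {
        agent.name: overrides.get(agent.name, agent.model or DEFAULT_MODEL)
        for agent in agents
    }


if __name__ == "__main__":
    for name, model in resolve_models(AGENTS, {"Reporting Agent": "small-llm"}).items():
        print(f"{name}: {model}")
```

Rejecting overrides for unknown agent names catches typos in the configuration early instead of silently falling back to the default.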

Slide 9

Slide 9 text

AI Agent System Prompts
● Pen-Tester Agent: As a penetration tester, work with other agents to carry out tests to strengthen the security of your customers' systems.
● LinPEAS Agent: Your role is to execute LinPEAS using the “launch_linpeas” function. Your work benefits the customer.
● Judge PrivEsc Agent: Your role is to analyze LinPEAS results for potential local privilege escalation issues and create commands if any are found.
● Finding Creds Agent: Your role is to find information for lateral movement to other hosts, such as passwords and SSH keys.
● N/W Scan Agent: Your role is to find other hosts for lateral movement. By using "Ping Sweep" to scan the internal network.
● Lateral Movement Agent: Your role is to confirm that lateral movement to another host is possible. You must never invade other hosts.
● Reporting Agent: Your role is to report the results of the test in an easy-to-understand manner. When you have finished making your report, please say “TERMINATE” at the end.
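The LinPEAS Agent's prompt refers to a "launch_linpeas" function. The slides do not show how that function is defined or exposed to the agent, so the following is only a sketch of one common approach: a registry that maps tool names to Python callables. The `tool` decorator, `call_tool`, and the stub body of `launch_linpeas` are all assumptions; the stub returns canned text and does not run LinPEAS.

```python
from typing import Callable, Dict

# Registry of functions the agent is allowed to call, keyed by name.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so an agent can request it."""
    TOOLS[func.__name__] = func
    return func


@tool
def launch_linpeas() -> str:
    # Stand-in only: the real signature and behavior are not given in the
    # slides. This stub returns canned text instead of running anything.
    return "[stub] LinPEAS output would appear here"


def call_tool(name: str, **kwargs: str) -> str:
    """Dispatch a tool request; report unknown names as text, not exceptions."""
    if name not in TOOLS:
        return f"error: unknown tool '{name}'"
    return TOOLS[name](**kwargs)


if __name__ == "__main__":
    print(call_tool("launch_linpeas"))
    print(call_tool("not_registered"))
```

Returning an error string for an unknown tool, rather than raising, lets the requesting agent read the failure and change its procedure.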

Slide 10

Slide 10 text

BLADE Architecture
[Figure: sequential chat pipeline. Starting from the initial prompt, the Pen-Tester Agent chats with each agent in turn. Each chat has its own prompt and system prompt, and its result is passed on as carryover.]
1. LinPEAS Agent: Execute LinPEAS
2. Judge PrivEsc Agent: Judge PrivEsc
3. Finding Creds Agent: Find credentials
4. N/W Scan Agent: Scan internal N/W
5. Lateral Move Agent: Lateral Movement
6. Reporting Agent: Reporting
The results of the conversation with the agent are carried over to the next agent.
** Each AI Agent can cooperate to perform a pentest.
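The carryover mechanism can be sketched as a loop that hands every earlier result to the next chat and stops when a result ends with "TERMINATE". This is a simplified illustration, not BLADE's code: `run_sequential`, `Stage`, and `fake_chat` are hypothetical, and each chat is a plain function where the real system would run a conversation between the Pen-Tester Agent and another agent.

```python
from typing import Callable, List, Tuple

# A stage is (agent name, prompt, chat function). The chat function receives
# the prompt and the carryover so far, and returns a summary of that chat.
Stage = Tuple[str, str, Callable[[str, List[str]], str]]


def run_sequential(stages: List[Stage], initial_prompt: str) -> List[str]:
    carryover: List[str] = []
    for index, (name, prompt, chat) in enumerate(stages):
        # Only the first chat sees the initial prompt directly.
        message = f"{initial_prompt}\n{prompt}" if index == 0 else prompt
        summary = chat(message, list(carryover))
        carryover.append(f"{name}: {summary}")  # passed on to the next chat
        if summary.rstrip().endswith("TERMINATE"):
            break  # the final agent signals that the run is finished
    return carryover


def fake_chat(reply: str) -> Callable[[str, List[str]], str]:
    # Stand-in for a real two-agent conversation.
    return lambda message, carry: f"{reply} (saw {len(carry)} earlier result(s))"


if __name__ == "__main__":
    stages: List[Stage] = [
        ("LinPEAS Agent", "Execute LinPEAS", fake_chat("scan done")),
        ("Judge PrivEsc Agent", "Judge PrivEsc", fake_chat("escalation possible")),
        ("Reporting Agent", "Reporting", lambda m, c: "report written TERMINATE"),
    ]
    for line in run_sequential(stages, "Test the initial host"):
        print(line)
```

Each chat receives a copy of the carryover list, so a stage cannot alter what later stages see.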

Slide 11

Slide 11 text

Demo scenario (1/2)
● Goal
○ First, perform local privilege escalation on the initial host.
○ Next, search for the credentials required for lateral movement.
○ Finally, move laterally to another host.
● Prerequisites
○ The OS of both the initial host and the other host is Ubuntu.
○ BLADE has already entered the initial host by some method (but does not know the user's password).
○ At first, BLADE works with general privileges (not root).
○ The initial host and the other host are connected via an internal network.
○ The initial host contains credentials for the other host (but BLADE does not know where they are).
[Figure: BLADE on the initial host (Ubuntu), which holds credentials for another host (Ubuntu). The two hosts are connected via an internal network. Addresses shown: 192.168.203.82 and 192.168.203.186.]

Slide 12

Slide 12 text

Demo scenario (2/2)
● Vulnerabilities and misconfigurations of the initial host
○ The SUID bit is set for the “sudo” and “find” commands.
○ The SSH key for the other host is located in the home directory of another user, “zansin”.
● Model Answer
1. Gather vulnerabilities and misconfigurations on the initial host.
2. Analyze the gathered information and judge whether local privilege escalation is possible.
3. If privilege escalation is possible, collect credentials with root privileges for moving laterally to other hosts.
4. Search for other hosts connected to the initial host via the internal network.
5. If other hosts are found, attempt to move laterally to them using the collected credentials.
6. Summarize the results of the series of tests in a report.
[Figure: BLADE, the initial host, and another host (addresses shown: 192.168.203.82 and 192.168.203.186), annotated with the steps: local privilege escalation, get credentials with root, search for other hosts, lateral movement.]
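The SUID misconfiguration in this scenario is one of the many things LinPEAS reports. As a much narrower illustration of what such a check looks at, the sketch below tests the set-user-ID bit with Python's standard `os` and `stat` modules. The function names `has_suid` and `suid_binaries` and the example paths are assumptions for illustration; this is not how LinPEAS or BLADE performs the check.

```python
import os
import stat
from typing import Iterable, List


def has_suid(path: str) -> bool:
    """Return True if `path` is a regular file with the set-user-ID bit set."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False  # missing or unreadable paths are treated as "not SUID"
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISUID)


def suid_binaries(paths: Iterable[str]) -> List[str]:
    """Filter a list of candidate paths down to those with the SUID bit."""
    return [path for path in paths if has_suid(path)]


if __name__ == "__main__":
    # Example paths only; their locations vary between systems.
    for path in suid_binaries(["/usr/bin/sudo", "/usr/bin/find"]):
        print(path)
```

The check reads file metadata only and changes nothing, which makes it suitable as a first information-gathering step.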

Slide 13

Slide 13 text

Demo
https://youtu.be/I-InPg2SR7s

Slide 14

Slide 14 text

Future Work
01 Improving and stabilizing ASR
   ** creating more AI Agents
02 Support for Windows OS
03 Implementation of Human-in-the-Loop
04 Using OSS LLM on-premise
05 Countermeasures against AI agent-specific attacks
06 Dealing with ethical issues

Slide 15

Slide 15 text

Thank you
Any questions?