Diagnostic Agent with ADK, Gemini and OSQuery

22/Jan/2025 • [email protected]
Diagnostic Agent using ADK, Gemini and OSQuery
Daniela Petruzalek
Developer Relations Engineer, Google Cloud

Contents
1 Introduction
2 Agent Development Kit (ADK)
3 OSQuery
4 The evolution of AIDA

A little about myself…
- DevRel at Google UK
- Originally from Brazil
- Previously Backend / Data Engineer
- Currently obsessed with AI
- Love Games, Anime and Cats =^_^=

Introduction
Space: The Final Frontier…

Can we do this today?

Agent Development Kit

Agent Development Kit (ADK)
An open source framework for the development and deployment of AI agents.
Optimised for Gemini and Google products, but also model-agnostic and environment-agnostic.
google.github.io/adk-docs

Agent Development Kit (ADK)
Available for:
➔ Python
➔ Java
➔ Go
➔ TypeScript (new)
google.github.io/adk-docs

OSQuery
An open source monitoring, instrumentation and analysis tool for operating systems that uses the SQL language.
Available for Linux, macOS and Windows.
Originally developed by Meta, now part of the Linux Foundation.
www.osquery.io

OSQuery

The Evolution of AIDA
AI Diagnostic Agent

AIDA: Goals
A generalist OS diagnostic assistant that answers queries like:
- Why is my computer so slow?
- I’m having a “too many open files” error…
- Which are the top memory-consuming processes?
- Can you find any signs of malware in my system?
- Please run a Level 1 Diagnostic Procedure

Architecture v1
[Diagram: user, Dev UI (runner), root agent (Gemini 2.5 Flash), run_osquery tool, OSQuery, OS]

MARCH 2025
agent.py

import platform

from google.adk.agents import Agent

# root_agent is the entry point for an ADK agent
root_agent = Agent(
    model="gemini-2.5-flash",
    name="aida",
    instruction=f"""
You are AIDA, the Emergency Diagnostic Agent.
- Mission: help the user identify and resolve system issues using operating systems knowledge and all tools available.
- Host OS: {platform.system().lower()}
""",
    tools=[
        run_osquery,
    ],
)

agent.py

import subprocess

# run_osquery is just a regular Python function
def run_osquery(query: str) -> str:
    """Runs a query using osquery.

    Args:
        query: Query to run, e.g., 'select * from battery'

    Returns:
        the query result as a JSON string.
    """
    result = subprocess.run(
        ["osqueryi", "--json", query],
        capture_output=True,
        text=True,
        timeout=60,
    )
    output = result.stdout.strip()
    return output
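The tool above returns only stdout, so a failed query would reach the model as an empty string. A minimal sketch of a more defensive variant follows. It assumes osqueryi reports failures through a non-zero exit code and stderr, and both the name run_osquery_safe and the injectable runner parameter are hypothetical additions, not part of the talk's code.

```python
import json
import subprocess
from typing import Callable


def run_osquery_safe(
    query: str,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> str:
    """Hypothetical variant of run_osquery that reports failures as JSON.

    `runner` defaults to subprocess.run and exists only so the function
    can be exercised without osqueryi installed.
    """
    try:
        result = runner(
            ["osqueryi", "--json", query],
            capture_output=True,
            text=True,
            timeout=60,
        )
    except FileNotFoundError:
        return json.dumps({"error": "osqueryi executable not found"})
    except subprocess.TimeoutExpired:
        return json.dumps({"error": "query timed out after 60 seconds"})
    # Assumption: a non-zero exit code means the query failed.
    if result.returncode != 0:
        return json.dumps({"error": result.stderr.strip() or "osqueryi failed"})
    return result.stdout.strip()
```

Returning the error text gives the model something to react to, for example by correcting a table name, instead of treating the failure as "no rows".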

User: “Please look for signs of malware in my machine”
AIDA: “On it…”
select * from processes where name = 'malware'

Making AIDA smarter…
Response quality depends on context quality.
Idea: improve specialist knowledge using Retrieval Augmented Generation (RAG):
- Schema discovery: improve the agent’s knowledge of the table schemas beyond PRAGMA table_info(table)
- Query library: ready-made queries for common use cases
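To see why PRAGMA table_info alone is a thin source of context, here is a small sketch using Python's built-in sqlite3 module as a stand-in for osquery. The two-column table is an assumption chosen to mirror osquery's load_average table. The pragma returns column names and types only, with no descriptions or example queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table mirroring the columns of osquery's load_average table.
conn.execute("CREATE TABLE load_average (period TEXT, average TEXT)")

# Each row is (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(load_average)").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type)
conn.close()
```

Nothing in that output tells the model what "period" means or when the table is useful, which is the gap the schema documentation is meant to fill.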

Making AIDA smarter…
OSQuery schemas: github.com/osquery/osquery/tree/master/specs
OSQuery packs: github.com/osquery/osquery/tree/master/packs
SQLite RAG: github.com/sqliteai/sqlite-rag

osquery/specs/posix/load_average.table

table_name("load_average")
description("Displays information about the system wide load averages.")
schema([
    Column("period", TEXT, "Period over which the average is calculated."),
    Column("average", TEXT, "Load average over the specified period."),
])
implementation("load_average@genLoadAverage")
examples([
    "select * from load_average;",
])

osquery/packs/osx-attacks.conf

{
  "platform": "darwin",
  "queries": {
    "WireLurker": {
      "query" : "select * from launchd where \
        name = 'com.apple.machook_damon.plist' OR \
        name = 'com.apple.globalupdate.plist' OR \
        name = 'com.apple.appstore.plughelper.plist' OR \
        name = 'com.apple.MailServiceAgentHelper.plist' OR \
        name = 'com.apple.systemkeychain-helper.plist' OR \
        name = 'com.apple.periodic-dd-mm-yy.plist';",
      "interval" : "3600",
      "version": "1.4.5",
      "description" : "(https://github.com/PaloAltoNetworks-BD/WireLurkerDetector)",
      "value" : "Artifact used by this malware"
    },...
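If the queries in a pack need to be handled individually rather than as raw file text, one possible approach is sketched below. It assumes pack files use backslash line continuations inside query strings, as the excerpt above appears to, which a strict JSON parser rejects. The helper load_pack_queries and the shortened sample text are hypothetical.

```python
import json
import re

# Shortened sample in the style of an osquery pack (two names only).
PACK_TEXT = r'''{
  "platform": "darwin",
  "queries": {
    "WireLurker": {
      "query" : "select * from launchd where \
        name = 'com.apple.machook_damon.plist' OR \
        name = 'com.apple.globalupdate.plist';",
      "interval" : "3600",
      "description" : "(https://github.com/PaloAltoNetworks-BD/WireLurkerDetector)",
      "value" : "Artifact used by this malware"
    }
  }
}'''


def load_pack_queries(text: str) -> dict[str, str]:
    """Return {query name: SQL} from pack text (hypothetical helper)."""
    # Join backslash-continued lines, which strict JSON does not allow.
    joined = re.sub(r"\\\r?\n\s*", "", text)
    pack = json.loads(joined)
    return {name: entry["query"] for name, entry in pack["queries"].items()}


queries = load_pack_queries(PACK_TEXT)
print(queries["WireLurker"])
```

Each extracted query could then be stored together with its description, so a search for a symptom can surface a ready-made query.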

Architecture v2: SQLite RAG
[Diagram: user, Dev UI (runner), root agent (Gemini 2.5 Flash) with the tools run_osquery (OSQuery, OS), discover_schema (SQLite RAG) and search_query_library (SQLite RAG). The SQLite RAG tools are marked NEW!]

ingest_osquery.py

import os

from sqlite_rag import SQLiteRag

def ingest(rag: SQLiteRag, file_path: str):
    with open(file_path, "r", encoding="utf-8") as f:
        content = f.read()
    rel_path = os.path.relpath(file_path, SPECS_DIR)
    rag.add_text(content, uri=rel_path)

# in __main__
rag = SQLiteRag.create(DB_PATH, settings={"quantize_scan": True})
files_to_ingest = [...]  # omitted for brevity
for i, file_path in enumerate(files_to_ingest):
    ingest(rag, file_path)
rag.quantize_vectors()
rag.close()

schema_rag.py

# opens the rag database
schema_rag = SQLiteRag.create(SCHEMA_DB_PATH)

def discover_schema(terms: str, platform: str, top_k: int):
    """
    Queries the osquery schema documentation and returns all table candidates
    to explore the provided search terms.

    Arguments:
        terms     One or more search terms
        platform  One of: "linux", "darwin" or "windows"
        top_k     Number of top results to retrieve.

    Returns:
        up to top_k related table schemas.
    """
    terms += " " + platform
    return schema_rag.search(terms, top_k=top_k)

queries_rag.py

# opens the rag database
queries_rag = SQLiteRag.create(PACKS_DB_PATH)

def search_query_library(terms: str, platform: str, top_k: int):
    """Search the query library to find queries corresponding to the search terms.

    Arguments:
        terms     One or more search terms
        platform  One of: "linux", "darwin" or "windows"
        top_k     Number of documents to return

    Returns:
        up to top_k best queries based on the search terms
    """
    terms += " " + platform
    return queries_rag.search(terms, top_k=top_k)
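Both tools append the platform name to the search terms before searching. The toy below illustrates the effect of that trick with plain word overlap. It is not how sqlite-rag ranks results, and the corpus, URIs and function names are invented for illustration.

```python
import re

# Tiny in-memory corpus; the URIs and descriptions are illustrative stand-ins.
DOCS = {
    "posix/load_average.table": "load_average system wide load averages linux darwin",
    "windows/services.table": "services registered windows services",
    "processes.table": "processes running processes linux darwin windows",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def toy_search(terms: str, platform: str, top_k: int) -> list[str]:
    """Rank documents by how many query words they share (not a real RAG)."""
    terms += " " + platform  # same trick as the tools above
    query = tokenize(terms)
    scored = [(len(query & tokenize(body)), uri) for uri, body in DOCS.items()]
    # Keep only documents with some overlap; best score first, URI as tie-break.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [uri for _, uri in ranked[:top_k]]


print(toy_search("load averages", "linux", top_k=2))
```

Because the platform word is part of the query, documents that mention the host OS score higher, which nudges results away from tables that do not exist on the current machine.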

agent.py

# root_agent is the entry point for an ADK agent
root_agent = Agent(
    model="gemini-2.5-flash",
    name="aida",
    instruction=f"""
...
- Always use search_query_library to look for useful queries
- If a query returns an empty result, use discover_schema to certify that the query is correct
""",
    tools=[
        run_osquery,
        discover_schema,
        search_query_library,
    ],
)

ADK Tools
Built-in tools: Google Search, code execution, Vertex AI RAG Engine, Vertex AI Search, BigQuery, …
- Only one built-in tool is supported per agent
- You cannot mix search and non-search tools
- Agents can call other agents: AgentTool

agent.py

from google.adk.tools.google_search_tool import google_search
from google.adk.tools.agent_tool import AgentTool

search_agent = Agent(
    model=MODEL,
    name="search_agent",
    description="An agent specialised in searching the web",
    instruction=f"""
Use the google_search tool to fulfill the request.
When searching about code or SQL queries, always return the complete information including examples.
""",
    tools=[
        google_search,
    ],
)
search_tool = AgentTool(search_agent)

agent.py

# root_agent is the entry point for an ADK agent
root_agent = Agent(
    model="gemini-2.5-flash",
    name="aida",
    instruction=f"""
...
- Use the search_tool to find possible root causes and investigation paths for the issue
""",
    tools=[
        run_osquery,
        discover_schema,
        search_query_library,
        search_tool,
    ],
)

[Diagram: user, Dev UI (runner), root agent (Gemini 2.5 Flash) with the tools run_osquery (OSQuery, OS), discover_schema (SQLite RAG), search_query_library (SQLite RAG) and search_tool (Google Search)]

This is getting a bit messy, isn’t it?

ADK Workflow Agents

[Diagram: user, Dev UI + Runner, AIDA (root agent), and a SequentialAgent made of Planner, Investigator and Summariser, with Google Search, SQLite RAG (query library), SQLite RAG (schemas) and OSQuery as tools]

agent.py

from google.adk.agents import Agent, SequentialAgent

planner = Agent(...)
investigator = Agent(...)
summariser = Agent(...)

# runs the three sub-agents one after another, in this order
diagnostic_pipeline = SequentialAgent(
    name="diagnostic_pipeline",
    sub_agents=[planner, investigator, summariser],
)

root_agent = Agent(
    ...,
    instruction="""
...
- When the user describes an issue delegate to the `diagnostic_pipeline`.
""",
    sub_agents=[diagnostic_pipeline],
)
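The idea behind a sequential pipeline can be sketched without ADK: each stage runs once, in a fixed order, and reads what earlier stages left in shared state. The plain functions and state keys below are hypothetical stand-ins for the LLM-backed agents, not ADK's implementation.

```python
from typing import Callable

State = dict[str, str]
Step = Callable[[State], None]


def planner(state: State) -> None:
    # Stand-in: a real planner would ask the model for an investigation plan.
    state["plan"] = f"check load and top processes for: {state['issue']}"


def investigator(state: State) -> None:
    # Stand-in: a real investigator would run osquery queries from the plan.
    state["findings"] = f"ran queries from plan ({state['plan']})"


def summariser(state: State) -> None:
    state["summary"] = f"Issue: {state['issue']}. Findings: {state['findings']}."


def run_sequential(steps: list[Step], state: State) -> State:
    """Run each step once, in order, against the same shared state."""
    for step in steps:
        step(state)
    return state


result = run_sequential([planner, investigator, summariser], {"issue": "slow computer"})
print(result["summary"])
```

Splitting the work this way keeps each stage's instructions and tool list small, which is the motivation for moving away from a single agent holding every tool.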

The client interface

[Diagram: user, Frontend + Runner, AIDA (root agent), and a SequentialAgent made of Planner, Investigator and Summariser, with Google Search, SQLite RAG (query library), SQLite RAG (schemas) and OSQuery as tools]

main.py

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService

from aida.agent import root_agent

ss = InMemorySessionService()
runner = Runner(app_name="aida", agent=root_agent, session_service=ss)

@app.post("/chat")
async def chat_handler(request: Request):
    # ... boring parts here, e.g. session management ...
    async def stream_generator():
        full_response = ""
        async for event in runner.run_async(
            user_id=user_id,
            session_id=session_id,
            new_message=user_query,
        ):
            # ... process event parts and build response...
    return StreamingResponse(stream_generator(), ...)
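The elided "process event parts" step could look roughly like the sketch below. It assumes each event carries an optional content object with a list of parts, each with optional text. The Part, Content and Event dataclasses and fake_run_async are stand-ins, so the example runs without ADK or a web framework.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Optional


@dataclass
class Part:  # stand-in for a content part
    text: Optional[str] = None


@dataclass
class Content:  # stand-in for an event's content
    parts: list[Part] = field(default_factory=list)


@dataclass
class Event:  # stand-in for an event yielded by the runner
    content: Optional[Content] = None


async def fake_run_async() -> AsyncIterator[Event]:
    """Hypothetical replacement for runner.run_async(...)."""
    events = (
        Event(Content([Part("Checking load... ")])),
        Event(),  # an event without content, e.g. a state update
        Event(Content([Part(None), Part("done.")])),
    )
    for event in events:
        yield event


async def stream_text(events: AsyncIterator[Event]) -> AsyncIterator[str]:
    """Yield each non-empty text part as soon as its event arrives."""
    async for event in events:
        if event.content is None:
            continue
        for part in event.content.parts:
            if part.text:
                yield part.text


async def collect() -> list[str]:
    return [chunk async for chunk in stream_text(fake_run_async())]


chunks = asyncio.run(collect())
print("".join(chunks))
```

In the handler above, an async generator like stream_text would be what gets passed to the streaming response, so the user sees text as each event arrives.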

Final Words
There is still a lot more that can be done:
- Memory
- Session control
- Better architecture: LoopAgent
- Application-specific knowledge
- Run shell commands? (danger zone)
The good news is that ADK makes these tasks quite easy to do.

If you would like to know more
Personal blog: danicat.dev
Talk materials: danicat.dev/events
LinkedIn: linkedin.com/in/petruzalek
Twitter: x.com/danicat83
Bluesky: danicat83.bsky.social
Thank you!
