Inside an AI Agent SDK: Building One from Scratch in Python

Inside an AI Agent SDK Building One from Scratch in
Python

whoami Adarsh Divakaran Python Developer Advocate at SerpApi Microsoft MVP
in Python

Outline • Components of an AI Agent • Building an
AI Agent SDK

Components of an AI Agent

LLM • Large Language Model: an AI system trained on
huge amounts of text to understand and generate human-like language. • Data include: Entire open source GitHub repositories, web pages, etc. • E.g: GPT 5.5, Claude Opus 4.7

LLM Common Interface: Chat Use cases: Fix writing, identify errors
in code, summarize PDFs, etc.

LLM • Does not have access to real time information
(apart from the training data) by default • They are intelligent, but constrained.

LLM Given proper context and tools, they can achieve much
more

LLM Solution: AI Agents LLMs are capable to do more
than just chat. AI agent is an AI system that can autonomously plan and execute multi-step actions toward a goal. Agents usually have specialized tools: Bash, Websearch

Choosing an LLM

Choosing an LLM - Benchmarks Source: https://artificialanalysis.ai/leaderboards/models

Choosing an LLM - Cost Source: https://artificialanalysis.ai/leaderboards/models

Choosing an LLM - Hosted vs Local

Choosing an LLM - Reasoning effort A reasoning LLM break
complex problems into smaller steps, often called “reasoning traces”. More thinking effort: More costs, Higher response times, Higher accuracy for complex tasks

Choosing an LLM - Context Window Context Window: model’s working
memory. • Latest ﬂagship models 1M tokens ~ 750k words ~ 3000 pages ~ 15 books • GPT 4 Context Window: 128K ~ 2 books

Choosing an LLM - Context Window Increased context -> More
tokens -> Higher cost -> Slower responses User: Convert 30 celsius to fahrenheit LLM: 30 degrees Celsius is equal to 86 degrees Fahrenheit. User: Is it bearable? LLM: The short answer is: Yes, 86°F (30°C) is generally manageable … User: <attachment: image> analyse the next 2 days weather forecast image LLM: The next two days will be characterized by warm temperatures …

System Prompt - The Personality A system prompt is the
model’s top-level instruction that guides its behavior during a conversation. It can set rules such as: • what role the AI should play • what tone it should use • what it should or should not do • how it should handle safety, tools, and user requests

System Prompt - Examples You are Lovable, an AI editor
that creates and modiﬁes web applications. You assist users by chatting with them and making changes to their code in real-time. …. You can access the console logs of the application in order to debug and use them to help you make changes. Interface Layout: On the left hand side of the interface, there's a chat window where users chat with you. …. . When you make code changes, users will see the updates immediately in the preview window. Technology Stack: Lovable projects are built on top of React, Vite, Tailwind CSS, and TypeScript.

System Prompt - Examples You are Codex, a coding agent
based on GPT-5. You and the user share one workspace, and your job is to collaborate with them until their goal is genuinely handled. … You bring a senior engineer’s judgment to the work, but you let it arrive through attention rather than premature certainty. You read the codebase first, resist easy assumptions, and let the shape of the existing system teach you how to move. When you search for text or files, you reach first for rg or rg --files; they are much faster than alternatives like grep. If rg is unavailable, you use the next best tool without fuss.

Memory - Short term vs Long term • Short-term memory
helps the agent remember what the user said earlier in the same conversation. • Lives inside the context window Long-term memory: Stores information beyond one conversation persists even after the current chat or task ends. E.g: RAG, ChatGPT Memories

Short term memory OpenAI Responses API {“role”: “developer”, “content”: “You
are PyCon bot. Refuse all unrelated questions”} {“role”: “user”, “content”: “Tell me about hiking”} {“role”: “assistant”, “content”: “Sorry, as a PyCon bot, I cannot help with that”} {“role”: “user”, “content”: “Tell me about this years PyCon”} {“role”: “assistant”, “content”: “PyCon US is happening at Long Beach, CA this year. …”}

Short term memory Anthropic Messages API System prompt: You are
PyCon bot. Refuse all unrelated questions {“role”: “user”, “content”: “Tell me about hiking”} {“role”: “assistant”, “content”: “Sorry, as a PyCon bot, I cannot help with that”} {“role”: “user”, “content”: “Tell me about this years PyCon”} {“role”: “assistant”, “content”: “PyCon US is happening at Long Beach, CA this year. …”}

The Agent Loop User sends a message -> Agent receives
the message -> Agent adds it to the conversation context -> Agent sends context + system prompt to the LLM -> LLM decides what to do, performs intermediate steps -> LLM generates ﬁnal response -> Agent sends response to user -> Wait for next instruction from user

- Linus Torvalds Talk is cheap. Show me the code.

Tools External capabilities the agent can use • Tools let
an AI agent do things beyond just generating text. • Used to take actions or get information • Examples: search the web, call an API, read ﬁles, run code, send emails, or query a database. • Chosen by the agent when needed.

MCP • Model Context Protocol: USB-C for AI integrations •
Eg: Github MCP, Figma MCP, etc. • Model can discover supported functionalities and can call them when needed.

MCP Inspector - GitHub MCP

Skills Reusable instructions or workﬂows • Skills tell an agent
how to perform a speciﬁc type of task and guide the agent’s behavior • A skill can include steps, rules, examples, formats, or best practices. • Useful for repeated tasks Examples: writing reports, analyzing PDFs, creating slides, debugging code, or handling customer support. Make agents more specialized

Skills

AI Agent Architecture ~ Backend Architecture Source: https://blog.anirudha.dev/the-missing-playbook-for-ai-agents/

Building the SDK

Why Build an SDK • Abstract away the complexity from
end users • Easier to switch across providers Similar to ORM vs Raw SQL queries debate • Learn the internals

Demos • Agent loop • Supporting multiple providers: OpenAI, Anthropic,
Ollama • Tool calling • MCP • Skills Building an AI Agent SDK

The Agent Capability Stack System Prompt Skills Tools MCP Global
behavior and constraints Packaged procedures (+ code + assets) Do something in the world Standard way to connect models to external tools and context Use for: • Safety boundaries, tone, refusal style • “Always do X” principles that apply every turn • Small, stable policies Use a skill when you want the model to: • Follow a repeatable workﬂow • Use scripts/templates • Do it sometimes, not always Use tools when the model must: • Call external services or databases • Create side effects (tasks outside of the environment, like canceling an order or sending an email) • Fetch live state Use MCP when you want to: • Expose tools/resources through a standard protocol • Connect agents to ﬁles, APIs, databases, IDEs, browsers, etc. • Reuse integrations across different models/clients

Example: Coding Agent System Prompt Skills Tools MCP Global behavior
and constraints Packaged procedures (+ code + assets) Do something in the world Standard way to connect models to external tools and context You are a Python coding agent built to demonstrate `babyagent` SDK. Only use Python for the backend logic. For UI, use Streamlit - no other frameworks allowed. After writing code, use linter and type checker to verify. • FastAPI best practices • Streamlit UI guidelines • Linter • Static type checker • Run and capture logs • Web search & fetch: To read documentation • GitHub MCP: Push code, create PRs • Playwright MCP: Load the app in browser and take screenshots

• Context compaction • Multi-agents • Evals: Benchmarks • Observability:
Logging • Errors: Rate limits, etc. Next Steps • Retries • Structured output • File uploads • Other provider speciﬁc customization options (like Built in tools) • Permissions - human in the loop

Summary • Started with Agent loop - continuous chat with
an LLM preserving the conversation history in context (short-term memory). • Added tools: Custom built utilities that agents can use. Added support across OpenAI, Anthropic and Ollama • Added remote MCP support: Native support for Anthropic & OpenAI; MCP to tool adaptor for Ollama • Added skills support: Native support for OpenAI; Support via system prompt to Anthropic and Ollama https://github.com/serpapi/babyagent

References Kumar Anirudha: The Missing Playbook for AI Agents Niyas
Mohammed: The Anatomy of an AI Coding Harness Sebastian Raschka: Components of A Coding Agent OpenAI: Agent guide Anthropic: SDK Docs & Blog Standards: modelcontextprotocol.io, skill.md

SerpApi Resources Sign-up at https://serpapi.com/ • API Playground: https://serpapi.com/playground •
Python SDK: https://github.com/serpapi/serpapi-python • MCP: https://github.com/serpapi/serpapi-mcp • Skill: https://github.com/serpapi/skills • CLI: https://github.com/serpapi/serpapi-cli • Point your agent to: https://serpapi.com/llms.txt Join Us: https://serpapi.com/careers

SerpApi Resources github.com/serpapi/pycon-us

Thank You! Connect [email protected] /in/adarsh-d github.com/serpapi/pycon-us

Inside an AI Agent SDK: Building One from Scrat...

Inside an AI Agent SDK: Building One from Scratch in Python

More Decks by Adarsh D

Other Decks in Programming

Featured

Transcript