Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Inside an AI Agent SDK: Building One from Scrat...

Sponsored · SiteGround - Reliable hosting with speed, security, and support you can count on.

Inside an AI Agent SDK: Building One from Scratch in Python

Avatar for Adarsh D

Adarsh D

June 03, 2026

More Decks by Adarsh D

Other Decks in Programming

Transcript

  1. LLM • Large Language Model: an AI system trained on

    huge amounts of text to understand and generate human-like language. • Data include: Entire open source GitHub repositories, web pages, etc. • E.g: GPT 5.5, Claude Opus 4.7
  2. LLM • Does not have access to real time information

    (apart from the training data) by default • They are intelligent, but constrained.
  3. LLM Solution: AI Agents LLMs are capable to do more

    than just chat. AI agent is an AI system that can autonomously plan and execute multi-step actions toward a goal. Agents usually have specialized tools: Bash, Websearch
  4. Choosing an LLM - Reasoning effort A reasoning LLM break

    complex problems into smaller steps, often called “reasoning traces”. More thinking effort: More costs, Higher response times, Higher accuracy for complex tasks
  5. Choosing an LLM - Context Window Context Window: model’s working

    memory. • Latest flagship models 1M tokens ~ 750k words ~ 3000 pages ~ 15 books • GPT 4 Context Window: 128K ~ 2 books
  6. Choosing an LLM - Context Window Increased context -> More

    tokens -> Higher cost -> Slower responses User: Convert 30 celsius to fahrenheit LLM: 30 degrees Celsius is equal to 86 degrees Fahrenheit. User: Is it bearable? LLM: The short answer is: Yes, 86°F (30°C) is generally manageable … User: <attachment: image> analyse the next 2 days weather forecast image LLM: The next two days will be characterized by warm temperatures …
  7. System Prompt - The Personality A system prompt is the

    model’s top-level instruction that guides its behavior during a conversation. It can set rules such as: • what role the AI should play • what tone it should use • what it should or should not do • how it should handle safety, tools, and user requests
  8. System Prompt - Examples You are Lovable, an AI editor

    that creates and modifies web applications. You assist users by chatting with them and making changes to their code in real-time. …. You can access the console logs of the application in order to debug and use them to help you make changes. Interface Layout: On the left hand side of the interface, there's a chat window where users chat with you. …. . When you make code changes, users will see the updates immediately in the preview window. Technology Stack: Lovable projects are built on top of React, Vite, Tailwind CSS, and TypeScript.
  9. System Prompt - Examples You are Codex, a coding agent

    based on GPT-5. You and the user share one workspace, and your job is to collaborate with them until their goal is genuinely handled. … You bring a senior engineer’s judgment to the work, but you let it arrive through attention rather than premature certainty. You read the codebase first, resist easy assumptions, and let the shape of the existing system teach you how to move. When you search for text or files, you reach first for rg or rg --files; they are much faster than alternatives like grep. If rg is unavailable, you use the next best tool without fuss.
  10. Memory - Short term vs Long term • Short-term memory

    helps the agent remember what the user said earlier in the same conversation. • Lives inside the context window Long-term memory: Stores information beyond one conversation persists even after the current chat or task ends. E.g: RAG, ChatGPT Memories
  11. Short term memory OpenAI Responses API {“role”: “developer”, “content”: “You

    are PyCon bot. Refuse all unrelated questions”} {“role”: “user”, “content”: “Tell me about hiking”} {“role”: “assistant”, “content”: “Sorry, as a PyCon bot, I cannot help with that”} {“role”: “user”, “content”: “Tell me about this years PyCon”} {“role”: “assistant”, “content”: “PyCon US is happening at Long Beach, CA this year. …”}
  12. Short term memory Anthropic Messages API System prompt: You are

    PyCon bot. Refuse all unrelated questions {“role”: “user”, “content”: “Tell me about hiking”} {“role”: “assistant”, “content”: “Sorry, as a PyCon bot, I cannot help with that”} {“role”: “user”, “content”: “Tell me about this years PyCon”} {“role”: “assistant”, “content”: “PyCon US is happening at Long Beach, CA this year. …”}
  13. The Agent Loop User sends a message -> Agent receives

    the message -> Agent adds it to the conversation context -> Agent sends context + system prompt to the LLM -> LLM decides what to do, performs intermediate steps -> LLM generates final response -> Agent sends response to user -> Wait for next instruction from user
  14. Tools External capabilities the agent can use • Tools let

    an AI agent do things beyond just generating text. • Used to take actions or get information • Examples: search the web, call an API, read files, run code, send emails, or query a database. • Chosen by the agent when needed.
  15. MCP • Model Context Protocol: USB-C for AI integrations •

    Eg: Github MCP, Figma MCP, etc. • Model can discover supported functionalities and can call them when needed.
  16. MCP

  17. Skills Reusable instructions or workflows • Skills tell an agent

    how to perform a specific type of task and guide the agent’s behavior • A skill can include steps, rules, examples, formats, or best practices. • Useful for repeated tasks Examples: writing reports, analyzing PDFs, creating slides, debugging code, or handling customer support. Make agents more specialized
  18. Why Build an SDK • Abstract away the complexity from

    end users • Easier to switch across providers Similar to ORM vs Raw SQL queries debate • Learn the internals
  19. Demos • Agent loop • Supporting multiple providers: OpenAI, Anthropic,

    Ollama • Tool calling • MCP • Skills Building an AI Agent SDK
  20. The Agent Capability Stack System Prompt Skills Tools MCP Global

    behavior and constraints Packaged procedures (+ code + assets) Do something in the world Standard way to connect models to external tools and context Use for: • Safety boundaries, tone, refusal style • “Always do X” principles that apply every turn • Small, stable policies Use a skill when you want the model to: • Follow a repeatable workflow • Use scripts/templates • Do it sometimes, not always Use tools when the model must: • Call external services or databases • Create side effects (tasks outside of the environment, like canceling an order or sending an email) • Fetch live state Use MCP when you want to: • Expose tools/resources through a standard protocol • Connect agents to files, APIs, databases, IDEs, browsers, etc. • Reuse integrations across different models/clients
  21. Example: Coding Agent System Prompt Skills Tools MCP Global behavior

    and constraints Packaged procedures (+ code + assets) Do something in the world Standard way to connect models to external tools and context You are a Python coding agent built to demonstrate `babyagent` SDK. Only use Python for the backend logic. For UI, use Streamlit - no other frameworks allowed. After writing code, use linter and type checker to verify. • FastAPI best practices • Streamlit UI guidelines • Linter • Static type checker • Run and capture logs • Web search & fetch: To read documentation • GitHub MCP: Push code, create PRs • Playwright MCP: Load the app in browser and take screenshots
  22. • Context compaction • Multi-agents • Evals: Benchmarks • Observability:

    Logging • Errors: Rate limits, etc. Next Steps • Retries • Structured output • File uploads • Other provider specific customization options (like Built in tools) • Permissions - human in the loop
  23. Summary • Started with Agent loop - continuous chat with

    an LLM preserving the conversation history in context (short-term memory). • Added tools: Custom built utilities that agents can use. Added support across OpenAI, Anthropic and Ollama • Added remote MCP support: Native support for Anthropic & OpenAI; MCP to tool adaptor for Ollama • Added skills support: Native support for OpenAI; Support via system prompt to Anthropic and Ollama https://github.com/serpapi/babyagent
  24. References Kumar Anirudha: The Missing Playbook for AI Agents Niyas

    Mohammed: The Anatomy of an AI Coding Harness Sebastian Raschka: Components of A Coding Agent OpenAI: Agent guide Anthropic: SDK Docs & Blog Standards: modelcontextprotocol.io, skill.md
  25. SerpApi Resources Sign-up at https://serpapi.com/ • API Playground: https://serpapi.com/playground •

    Python SDK: https://github.com/serpapi/serpapi-python • MCP: https://github.com/serpapi/serpapi-mcp • Skill: https://github.com/serpapi/skills • CLI: https://github.com/serpapi/serpapi-cli • Point your agent to: https://serpapi.com/llms.txt Join Us: https://serpapi.com/careers