[EuRuKo 2024] Intro to AI Agents

The author of Langchain.rb will walk you through current capabilities of LLMs and what can be built today. We will build a business process automation AI agent in Ruby and discuss the common pitfalls and misconceptions. We’ll discuss what might be emerging as a new LLM-powered software stack.

Generative AI has been taking the world by storm. The Coatue AI report (Nov 2023) places AI models at the center of the modern tech stacks that application developers will be building on going forward. It would not be controversial to say that the Ruby ecosystem is lacking in its support for and understanding of the AI, ML and DS landscape. If we’d like to stay relevant in the future, we need to start building the foundations now. We’ll look at what Generative AI is, what kind of applications developers in other communities are building and how Ruby can be used to build similar applications today. We’ll cover Retrieval Augmented Generation (RAG), vector embeddings and semantic search, prompt engineering, and what the state of the art (SOTA) in evaluating LLM output looks like today. We will also cover AI Agents, semi-autonomous general purpose LLM-backed applications, and what they’re capable of today. We’ll make a case for why Ruby is a great language for building these applications because of its strengths and its incredible ecosystem. After the slides, I’ll walk the attendees through building an AI Agent in 15 min with Langchain.rb.

Andrei Bondarev

September 13, 2024

Transcript

  1. Work
    Source Labs LLC: Software-development firm. Clients: VC-backed startups and enterprises. We <3 Rails.
    Patterns AI: Applied AI research organization. Open-source work. We <3 Ruby.
  2. GenAI Impact
    Before: 1 month to label data, 3 months to train a custom model, 3 months to deploy (optimize).
    After: a few days of prompt engineering, a few weeks of basic RAG (if needed), a few days to deploy.
  3. (Re-)Rise of AI Agents
    1950s: Intelligent Machines
    1970s to 1980s: Expert Systems
    1990s to 2000s: Software Agents
    2010s: Chatbots
    2020s: LLMs as Agents
  4. AI Agent
    Definition: An autonomous software system capable of perceiving its environment, making decisions, and taking actions to achieve specific goals.
    Environment awareness. Decision-making. Action-taking.
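
The perceive, decide, act cycle in the definition above can be sketched in a few lines of plain Ruby. This is only a toy illustration: `RestockAgent`, its `reorder_point` and `batch_size` parameters, and the hash-based environment are all hypothetical names, and the hard-coded rule in `decide` stands in for whatever an LLM would choose.

```ruby
# A toy perceive -> decide -> act loop. Everything here is hypothetical:
# the "environment" is a plain hash, and the decision rule is hard-coded
# where a real agent would consult an LLM.
class RestockAgent
  attr_reader :log

  def initialize(reorder_point:, batch_size:)
    @reorder_point = reorder_point
    @batch_size = batch_size
    @log = []
  end

  # Perceive: read the part of the environment the agent cares about.
  def perceive(environment)
    environment.fetch(:stock)
  end

  # Decide: pick an action that moves toward the goal (enough stock).
  def decide(stock)
    stock < @reorder_point ? :reorder : :idle
  end

  # Act: change the environment and record what was done.
  def act(action, environment)
    environment[:stock] += @batch_size if action == :reorder
    @log << action
  end

  # Run until the goal is reached or the step budget is spent.
  def run(environment, max_steps: 10)
    max_steps.times do
      action = decide(perceive(environment))
      break if action == :idle

      act(action, environment)
    end
    environment
  end
end
```

The `max_steps` budget is a deliberate design choice: an autonomous loop should have a hard stop in case the goal is never reached.
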
  5. Agent vs Assistant
    Assistant: Conversational system that continuously takes directions from a human.
    Agent: Autonomous system that independently executes a task (like a background job).
  6. Use-cases
    Automating business processes. Mundane low-IQ tasks. Personal assistant (co-pilot).
    Tasks in a consulting business: creating invoices from timesheets, categorizing business expenses, writing project proposals (incl. service offering, meeting notes), writing job descriptions, writing JIRA tickets.
  7. Components of an AI agent
    1. Planning & Reasoning 2. Role Playing 3. Environment Perception 4. Tool Calling 5. Memory 6. Evaluations (Evals)
  8. Planning
    Plan formulation: Decomposing a top-level task into numerous sub-tasks.
    Plan reflection: Leveraging a feedback mechanism to reflect upon a plan and evaluate its merits.
  9. Reasoning
    Cornerstone for problem-solving, decision-making and critical analysis. Deductive, inductive, and abductive reasoning are the primary forms of reasoning. Reasoning capacity is crucial for solving complex tasks.
  10. Chain-of-Thought (CoT)
    Forcing the AI to explain its reasoning.
    Comparison: without Chain-of-Thought prompting vs. with Chain-of-Thought prompting.
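
As a rough illustration of the difference between the two prompting styles, the sketch below builds a direct prompt and a Chain-of-Thought prompt for the same question. The instruction wording, the `Answer:` line convention, and all three method names are assumptions made for this example; no model is called.

```ruby
# Builds two prompts for the same question: a direct one and a
# Chain-of-Thought one. The instruction wording is an assumption.
def direct_prompt(question)
  "Question: #{question}\nAnswer with the final result only."
end

def chain_of_thought_prompt(question)
  <<~PROMPT.chomp
    Question: #{question}
    Think step by step. Write out each step of your reasoning,
    then give the final result on a last line starting with "Answer:".
  PROMPT
end

# Pulls the final answer out of a Chain-of-Thought style completion.
# Returns nil when the completion has no "Answer:" line.
def extract_answer(completion)
  line = completion.lines.reverse.find { |l| l.start_with?("Answer:") }
  line && line.delete_prefix("Answer:").strip
end
```

Asking for the final result on a marked line keeps the reasoning visible while still letting the calling code parse out a single value.
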
  11. Role playing
    Forcing the AI to adopt a certain personality, character, and behavior via prompt engineering.
    Examples: Strict Manager, Relaxed Manager, Dungeon Master, Helpful AI assistant.
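
Role playing is usually done by placing the persona in a system prompt. The sketch below assumes the common chat convention of `role`/`content` message hashes; the `PERSONAS` table, its wording, and `build_messages` are hypothetical, and nothing is sent to a model.

```ruby
# Hypothetical personas expressed as system prompts. The role/content
# message shape follows a common chat convention and is an assumption here.
PERSONAS = {
  strict_manager: "You are a strict manager. Be terse and insist on deadlines.",
  relaxed_manager: "You are a relaxed manager. Be friendly and flexible.",
  dungeon_master: "You are a Dungeon Master narrating a fantasy adventure.",
  helpful_assistant: "You are a helpful AI assistant."
}.freeze

# Returns the message list for one user turn under the chosen persona.
# Raises KeyError when the persona is not defined.
def build_messages(persona, user_input)
  [
    { role: "system", content: PERSONAS.fetch(persona) },
    { role: "user", content: user_input }
  ]
end
```

The same user input, for example `build_messages(:strict_manager, "The report will be late.")`, can then be replayed under each persona to compare behavior.
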
  12. Tool Calling
    Use tools to do the following: get data from external sources (APIs), get real-time data, take actions, execute deterministic tasks [1].
    Comparison: without tools vs. using the tool (Code Interpreter).
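
A minimal sketch of the tool-calling idea, assuming the model replies with a JSON object naming a tool and its arguments. The `TOOLS` registry, the JSON shape, and `fake_llm_tool_call` (a stand-in for a real model response) are all hypothetical; this is not the Langchain.rb API.

```ruby
require "json"

# Hypothetical tool registry: tool name => callable taking an arguments hash.
TOOLS = {
  "multiply" => ->(args) { args.fetch("a") * args.fetch("b") },
  "current_date" => ->(_args) { Time.now.strftime("%Y-%m-%d") }
}.freeze

# Stand-in for a model response. A real LLM would choose the tool and its
# arguments; here the choice is hard-coded for illustration.
def fake_llm_tool_call(_question)
  JSON.generate({ "tool" => "multiply", "arguments" => { "a" => 1234, "b" => 5678 } })
end

# Parses the tool call and runs the matching tool deterministically.
def run_tool_call(raw_call)
  call = JSON.parse(raw_call)
  name = call.fetch("tool")
  tool = TOOLS.fetch(name) { raise ArgumentError, "unknown tool: #{name}" }
  tool.call(call.fetch("arguments", {}))
end
```

The arithmetic is done by Ruby rather than by the model, which is the point of handing deterministic tasks to a tool. Rejecting unknown tool names keeps the model from invoking anything outside the registry.
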
  13. Evaluations
    Benchmarks: Comparing to a large dataset of question-answer pairs.
    "LLM as a Judge": Asking an LLM whether the answer fits a list of criteria.
  14. Benchmarks
    Example dataset: gretelai/gsm8k-synthetic-diverse-405b (Datasets at Hugging Face).
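
The two evaluation styles from the Evaluations slide, benchmarks and "LLM as a Judge", can be sketched as follows. Both methods take the model as a block so that a stub can be used; `benchmark_accuracy`, `judge`, and the exact-match scoring rule are assumptions for illustration, not an established eval harness.

```ruby
# Benchmark-style eval: exact-match accuracy over question-answer pairs.
# The block stands in for the model under test.
def benchmark_accuracy(dataset, &model)
  correct = dataset.count do |pair|
    model.call(pair[:question]).strip == pair[:answer]
  end
  correct.to_f / dataset.size
end

# "LLM as a Judge"-style eval: ask a judge about each criterion in turn.
# The block stands in for the judge model and returns true or false.
def judge(answer, criteria, &judge_model)
  criteria.to_h { |criterion| [criterion, judge_model.call(answer, criterion)] }
end
```

With a real model, the blocks would wrap LLM calls. Exact match is the simplest possible scoring rule and is often too strict for free-form answers, which is one reason judge-style evals exist.
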
  15. Nerds & Threads
    Selling comfortable nerdy t-shirts for software engineers that work from home.
    Services: Customer Management, Email Service, Payment Gateway Service, Order Management, Inventory Management, Shipping Service.
  16. Why would you use this?
    Changing requirements on the fly.
    Text-to-SQL using the Database tool.
  17. References
    "Tool Use with Open-Source LLMs" by Rick Lamers (Groq)
    1. https://arxiv.org/abs/2309.07864
    2. https://www.promptingguide.ai/techniques/cot
    3. https://docs.google.com/presentation/d/1EH11pXLanLMoLGEXd_YduYxEEk3X2Y63KOOFsysq1LY
    4. https://www.ibm.com/think/topics/ai-agents
    5. Berkeley Function-Calling Leaderboard
    6. https://garymarcus.substack.com/p/math-is-hard-if-you-are-an-llm-and
    7. https://obie.medium.com/my-kids-and-i-just-played-d-d-with-chatgpt4-as-the-dm-43258e72b2c6
    8.