[EuRuKo 2024] Intro to AI Agents

Slide 1

Slide 1 text

Intro to AI Agents
EuRuKo 2024, Friday, September 13th, 2024
by Andrei Bondarev

Slide 2

Slide 2 text

Work
Source Labs LLC: software-development firm. Clients: VC-backed startups and enterprises. We <3 Rails.
Patterns AI: applied AI research organization. Open-source work. We <3 Ruby.

Slide 3

Slide 3 text

GenAI Impact
Before: 1 month to label data, 3 months to train a custom model, 3 months to deploy (optimize).
After: a few days of prompt engineering, a few weeks for basic RAG (if needed), a few days to deploy.

Slide 4

Slide 4 text

Common ML tasks: classification, named entity recognition, summarization, translation.

Slide 5

Slide 5 text

Capabilities, an API call away Adoption Cost

Slide 6

Slide 6 text

(Re-)Rise of AI Agents
1950s: Intelligent Machines
1970s to 1980s: Expert Systems
1990s to 2000s: Software Agents
2010s: Chatbots
2020s: LLMs as Agents

Slide 7

Slide 7 text

The Vision

Slide 8

Slide 8 text

AI Agent
Definition: An autonomous software system capable of perceiving its environment, making decisions, and taking actions to achieve specific goals.
Environment awareness
Decision-making
Action-taking
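The three parts of the definition above can be sketched as a perceive, decide, act loop. This is a minimal illustration in plain Ruby, not how an LLM-backed agent is actually built: the `ThermostatAgent` class, its environment hash, and its decision rule are all hypothetical stand-ins for the model-driven steps.

```ruby
# Hypothetical toy agent: it perceives a temperature, decides on an action,
# and acts until the goal (the target temperature) is reached.
class ThermostatAgent
  attr_reader :log

  def initialize(target:)
    @target = target
    @log = []
  end

  # Environment awareness: read the current state.
  def perceive(env)
    env[:temperature]
  end

  # Decision-making: pick an action that moves toward the goal.
  def decide(temperature)
    return :idle if temperature == @target

    temperature < @target ? :heat : :cool
  end

  # Action-taking: change the environment and record what was done.
  def act(action, env)
    env[:temperature] += 1 if action == :heat
    env[:temperature] -= 1 if action == :cool
    @log << action
  end

  # Run the loop until the goal is met or the step budget runs out.
  def run(env, max_steps: 20)
    max_steps.times do
      action = decide(perceive(env))
      break if action == :idle

      act(action, env)
    end
    env
  end
end

env = { temperature: 18 }
agent = ThermostatAgent.new(target: 21)
agent.run(env)
```

The step budget (`max_steps`) is a design choice worth keeping in any autonomous loop, so that a bad decision rule cannot run forever.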

Slide 9

Slide 9 text

Agent vs Assistant
Assistant: conversational system that continuously takes directions from a human.
Agent: autonomous system that independently executes a task (like a background job).

Slide 10

Slide 10 text

Agent vs Assistant
Conversational Assistant: free-for-all input from the user.
Autonomous Agent: guided input from the user.

Slide 11

Slide 11 text

Use-cases
Automating business processes
Mundane low-IQ tasks
Personal assistant (co-pilot)
Tasks in a consulting business:
  Creating invoices from timesheets
  Categorizing business expenses
  Writing project proposals (incl. service offering, meeting notes)
  Writing job descriptions
  Writing JIRA tickets

Slide 12

Slide 12 text

Components of an AI agent
1. Planning & Reasoning
2. Role Playing
3. Environment Perception
4. Tool Calling
5. Memory
6. Evaluations (Evals)

Slide 13

Slide 13 text

Planning
Plan formulation: decomposing a top-level task into numerous sub-tasks.
Plan reflection: leveraging a feedback mechanism to reflect upon a plan and evaluate its merits.

Slide 14

Slide 14 text

Reasoning
The cornerstone of problem-solving, decision-making, and critical analysis.
Deductive, inductive, and abductive reasoning are the primary forms.
Reasoning capacity is crucial for solving complex tasks.

Slide 15

Slide 15 text

Chain-of-Thought (CoT)
Forcing the AI to explain its reasoning.
Example panels: without Chain-of-Thought prompting, with Chain-of-Thought prompting.
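The contrast between the two prompting styles can be sketched as a small prompt builder. This is only an illustration: the `build_prompt` helper, the sample question, and the exact wording of the instruction are assumptions, and no model is called here.

```ruby
# Assumed wording for the Chain-of-Thought instruction; there is no single
# standard phrasing.
COT_INSTRUCTION = "Let's think step by step, and explain your reasoning " \
                  "before giving the final answer."

# Builds a prompt with or without the Chain-of-Thought instruction.
def build_prompt(question, chain_of_thought: false)
  parts = ["Question: #{question}"]
  parts << COT_INSTRUCTION if chain_of_thought
  parts << "Answer:"
  parts.join("\n")
end

question = "A shop sells 3 shirts for $45. How much do 7 shirts cost?"
plain_prompt = build_prompt(question)
cot_prompt = build_prompt(question, chain_of_thought: true)
```

The only difference between the two prompts is the extra instruction line, which asks the model to show its intermediate steps before answering.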

Slide 16

Slide 16 text

Role playing
Forcing the AI to adopt a certain personality, character, and behavior via prompt engineering.
Examples: Strict Manager, Relaxed Manager, Dungeon Master, Helpful AI assistant.
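One common way to set a role is a system message placed ahead of the user's input. The sketch below assumes a chat-style message format of `role`/`content` hashes, which many LLM APIs accept in some form. The `ROLE_PROMPTS` wording and the `messages_for` helper are hypothetical.

```ruby
# Hypothetical system prompts for the roles named on the slide.
ROLE_PROMPTS = {
  strict_manager: "You are a strict manager. Be terse, insist on deadlines, " \
                  "and push back on vague plans.",
  relaxed_manager: "You are a relaxed manager. Be encouraging and flexible " \
                   "about timelines.",
  dungeon_master: "You are a Dungeon Master. Narrate the scene and ask the " \
                  "players what they do next.",
  helpful_assistant: "You are a helpful AI assistant."
}.freeze

# Builds the message list for a chat-style API (assumed format).
# Raises KeyError for an unknown role instead of silently using a default.
def messages_for(role, user_input)
  [
    { role: "system", content: ROLE_PROMPTS.fetch(role) },
    { role: "user", content: user_input }
  ]
end

messages = messages_for(:dungeon_master, "We open the door.")
```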

Slide 17

Slide 17 text

Environment perception
"Today is September 13, 2024"
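A model has no clock of its own, so facts like the current date have to be supplied in the prompt. A minimal sketch, assuming the date is injected into a system prompt. The `system_prompt` helper and its wording are hypothetical.

```ruby
require "date"

# Builds a system prompt that tells the model the current date.
# `today` is injectable so the output is reproducible in tests.
def system_prompt(today: Date.today)
  "You are a helpful AI assistant. Today is #{today.strftime('%B %-d, %Y')}."
end

prompt = system_prompt(today: Date.new(2024, 9, 13))
```

The same approach extends to other environment facts, such as the user's time zone or the current state of a record the agent is working on.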

Slide 18

Slide 18 text

Tool/Function Calling
Structured Outputs: responses adhere to a predefined JSON schema.
External Tools: intent detection.
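Checking that a response adheres to a schema can be sketched as follows. This is a deliberately tiny stand-in for real JSON Schema validation: `INVOICE_SCHEMA`, `parse_structured`, and the sample response are hypothetical, and only required keys and basic types are checked.

```ruby
require "json"

# Hypothetical minimal schema: required key => expected Ruby class.
INVOICE_SCHEMA = {
  "client" => String,
  "hours" => Numeric,
  "rate" => Numeric
}.freeze

# Parses a model response and verifies it against the schema.
# Raises JSON::ParserError on invalid JSON and ArgumentError on a mismatch.
def parse_structured(json, schema)
  data = JSON.parse(json)
  schema.each do |key, type|
    raise ArgumentError, "missing key: #{key}" unless data.key?(key)
    raise ArgumentError, "#{key} must be a #{type}" unless data[key].is_a?(type)
  end
  data
end

# Stand-in for text returned by a model.
response = '{"client": "Acme", "hours": 40, "rate": 150}'
invoice = parse_structured(response, INVOICE_SCHEMA)
total = invoice["hours"] * invoice["rate"]
```

Failing loudly on a malformed response lets the calling code retry or fall back, rather than passing bad data downstream.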

Slide 19

Slide 19 text

Tool Calling
Use tools to do the following:
Get data from external sources (APIs)
Get real-time data
Take actions
Execute deterministic tasks [1]
Example panels: without tools, using the tool (Code Interpreter).
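The dispatch step of tool calling can be sketched in plain Ruby: the model emits a structured request naming a tool and its arguments, and the application runs the matching code. Everything here is hypothetical, including the `TOOLS` registry, the `calculator` tool, and the JSON shape of the request. Real LLM APIs and langchainrb define their own formats.

```ruby
require "json"

ALLOWED_OPS = %w[+ - * /].freeze

# Hypothetical tool registry: tool name => description and handler.
TOOLS = {
  "calculator" => {
    description: "Apply a basic arithmetic operator to two numbers",
    handler: lambda do |args|
      op = args.fetch("op")
      # Whitelist operators so model output cannot invoke arbitrary methods.
      raise ArgumentError, "unsupported op: #{op}" unless ALLOWED_OPS.include?(op)

      args.fetch("a").public_send(op, args.fetch("b"))
    end
  }
}.freeze

# Parses a tool-call request (assumed JSON shape) and runs the tool.
def execute_tool_call(json)
  call = JSON.parse(json)
  name = call.fetch("name")
  tool = TOOLS.fetch(name) { raise ArgumentError, "unknown tool: #{name}" }
  tool[:handler].call(call.fetch("arguments"))
end

# Stand-in for what a model might emit instead of guessing the product.
model_output = '{"name": "calculator", "arguments": {"a": 1234, "b": 5678, "op": "*"}}'
result = execute_tool_call(model_output)
```

Arithmetic is a typical deterministic task: the model only has to choose the tool and its arguments, while the exact computation is done by ordinary code.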

Slide 20

Slide 20 text

Tool Calling

Slide 21

Slide 21 text

Memory ("remembering")
Saving the environment, progress, and tool calls to memory.
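A minimal sketch of such a memory, assuming a simple in-process store with a fixed capacity. The `AgentMemory` class and its entry kinds are hypothetical. A real agent would more likely persist this to a database.

```ruby
# Hypothetical bounded memory: keeps the most recent entries and renders
# them as text that can be placed into the next prompt.
class AgentMemory
  Entry = Struct.new(:kind, :content)

  def initialize(max_entries: 50)
    @entries = []
    @max_entries = max_entries
  end

  # Stores an entry, dropping the oldest ones once capacity is exceeded.
  def remember(kind, content)
    @entries << Entry.new(kind, content)
    @entries.shift while @entries.size > @max_entries
    self
  end

  # Returns all entries, or only those of the given kind.
  def recall(kind = nil)
    kind ? @entries.select { |entry| entry.kind == kind } : @entries.dup
  end

  # Renders memory as plain text for inclusion in a prompt.
  def to_context
    @entries.map { |entry| "[#{entry.kind}] #{entry.content}" }.join("\n")
  end
end

memory = AgentMemory.new(max_entries: 3)
memory.remember(:environment, "Today is September 13, 2024")
memory.remember(:progress, "Invoice draft created")
memory.remember(:tool_call, "calculator(40 * 150) => 6000")
memory.remember(:progress, "Invoice sent")
```

Capping the number of entries is one simple way to keep the rendered context within a model's input limit.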

Slide 22

Slide 22 text

Memory => Retrieval Augmented Generation (RAG)
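The retrieve-then-generate idea can be sketched with a toy retriever. This is an assumption-heavy illustration: it ranks documents by cosine similarity over word counts, whereas production RAG systems typically use embedding vectors and a vector database. The `DOCUMENTS`, helper names, and prompt wording are hypothetical.

```ruby
# Hypothetical knowledge base.
DOCUMENTS = [
  "Orders ship within two business days.",
  "Refunds are issued within 14 days of a return.",
  "T-shirts are made of 100% organic cotton."
].freeze

# Turns text into a hash of term => count (a stand-in for an embedding).
def term_counts(text)
  text.downcase.scan(/[a-z0-9]+/).tally
end

# Cosine similarity between two term-count hashes.
def cosine_similarity(a, b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = ->(vector) { Math.sqrt(vector.values.sum { |count| count * count }) }
  denominator = norm.call(a) * norm.call(b)
  denominator.zero? ? 0.0 : dot / denominator
end

# Retrieval: return the k documents most similar to the query.
def retrieve(query, documents, k: 1)
  query_vector = term_counts(query)
  documents.max_by(k) { |doc| cosine_similarity(query_vector, term_counts(doc)) }
end

# Augmentation: put the retrieved text into the prompt sent to the model.
def augmented_prompt(query, documents)
  context = retrieve(query, documents).join("\n")
  "Context:\n#{context}\n\nQuestion: #{query}\nAnswer using only the context above."
end

prompt = augmented_prompt("When are refunds issued?", DOCUMENTS)
```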

Slide 23

Slide 23 text

Common problems
Hallucinations
Poor reasoning
Unreliable tool calling

Slide 24

Slide 24 text

Evaluations
Benchmarks: comparing to a large dataset of question-answer pairs.
"LLM as a Judge": asking an LLM whether the answer fits a list of criteria.
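Both evaluation styles can be sketched without calling a real model. In this illustration the `BENCHMARK` data, the `STUB_MODEL` lambda, and the judge prompt wording are all hypothetical. The stub deliberately answers one question incorrectly so the score is not trivially perfect.

```ruby
# Hypothetical benchmark of question-answer pairs.
BENCHMARK = [
  { question: "2 + 2", answer: "4" },
  { question: "10 / 2", answer: "5" },
  { question: "3 * 3", answer: "9" }
].freeze

# Stand-in for an LLM call; it gets "3 * 3" wrong on purpose.
STUB_MODEL = lambda do |question|
  { "2 + 2" => "4", "10 / 2" => "5", "3 * 3" => "6" }.fetch(question)
end

# Benchmark evaluation: fraction of exact-match answers.
def accuracy(model, benchmark)
  correct = benchmark.count do |example|
    model.call(example[:question]).strip == example[:answer]
  end
  correct.to_f / benchmark.size
end

# "LLM as a Judge": builds the prompt that asks a second model to grade
# an answer against a list of criteria.
def judge_prompt(question, answer, criteria)
  numbered = criteria.each_with_index.map { |c, i| "#{i + 1}. #{c}" }.join("\n")
  "You are grading an answer.\n" \
    "Question: #{question}\n" \
    "Answer: #{answer}\n" \
    "Criteria:\n#{numbered}\n" \
    "Reply with PASS or FAIL for each criterion."
end

score = accuracy(STUB_MODEL, BENCHMARK)
grading = judge_prompt("What is your refund policy?",
                       "Refunds are issued within 14 days.",
                       ["Is factually grounded", "Is polite"])
```

Exact-match scoring suits questions with a single correct answer, while the judge approach is for open-ended answers where criteria matter more than wording.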

Slide 25

Slide 25 text

Benchmarks
Hugging Face dataset: gretelai/gsm8k-synthetic-diverse-405b

Slide 26

Slide 26 text

Reasoning (Next Frontier)
Training models specifically on reasoning data, but… no good training data.

Slide 27

Slide 27 text

langchainrb ⭐ Ruby framework for building LLM-powered applications

Slide 28

Slide 28 text

Demo

Slide 29

Slide 29 text

Nerds & Threads
Selling comfortable nerdy t-shirts for software engineers that work from home.
Services: Customer Management, Email Service, Payment Gateway Service, Order Management, Inventory Management, Shipping Service.

Slide 30

Slide 30 text

Diagram

Slide 31

Slide 31 text

Code

Slide 32

Slide 32 text

Demo AI Assistant Chat

Slide 33

Slide 33 text

Why would you use this?
Changing requirements on the fly
Text-to-SQL using the Database tool

Slide 34

Slide 34 text

Recap

Slide 35

Slide 35 text

Thank you!
@rushing_andrei
@andreibondarev
in/andreibondarev
Discord

Slide 36

Slide 36 text

References
1. "Tool Use with Open-Source LLMs" by Rick Lamers (Groq)
2. https://arxiv.org/abs/2309.07864
3. https://www.promptingguide.ai/techniques/cot
4. https://docs.google.com/presentation/d/1EH11pXLanLMoLGEXd_YduYxEEk3X2Y63KOOFsysq1LY
5. https://www.ibm.com/think/topics/ai-agents
6. Berkeley Function-Calling Leaderboard
7. https://garymarcus.substack.com/p/math-is-hard-if-you-are-an-llm-and
8. https://obie.medium.com/my-kids-and-i-just-played-d-d-with-chatgpt4-as-the-dm-43258e72b2c6

Slide 37

Slide 37 text

langchain.rb contributors