Building AI Agents in Ruby
Rocky Mountain Ruby 2024
Tuesday, October 8th, 2024
by Andrei Bondarev
Slide 2
Slide 2 text
Work
Source Labs LLC
Patterns AI
Slide 3
Slide 3 text
My Impact
Slide 4
Slide 4 text
GenAI Impact
Before:
Label data: 1 month
Train custom model: 3 months
Deploy (optimize): 3 months
After:
Prompt engineering: few days
Basic RAG (if needed): few weeks
Deploy: few days
Slide 5
Slide 5 text
Common ML tasks
Data Structuring
Summarization
Classification
Language Translation
Content Generation
Named Entity Recognition
Slide 6
Slide 6 text
Capabilities, an API call away
Adoption
Cost
Slide 7
Slide 7 text
AI in every stack
Slide 8
Slide 8 text
AI Agents in every Enterprise
Slide 9
Slide 9 text
(Re-)Rise of AI Agents
1950s: Intelligent Machines
1970s to 1980s: Expert Systems
1990s to 2000s: Software Agents
2010s: Chatbots
2020s: LLMs as Agents
Slide 10
Slide 10 text
The Vision
Slide 11
Slide 11 text
AI Agent
Definition: An autonomous software system capable of perceiving its environment, making decisions, and
taking actions to achieve specific goals.
Environment awareness
Decision-making
Action-taking
Slide 12
Slide 12 text
Agent vs Assistant
Conversational Assistant
Conversational system that continuously takes
directions from a human
Autonomous Agent
Autonomous system that independently executes a
task (like a background job)
Slide 13
Slide 13 text
Use-cases
Automating business processes
Mundane low-IQ tasks
Personal assistant (co-pilot)
Time-consuming tasks
Tasks in a consulting business:
Creating invoices from timesheets
Categorizing business expenses
Writing project proposals (incl. service offering, meeting notes)
Writing job descriptions
Writing JIRA tickets
Reasoning & Planning
Cornerstone for problem-solving, decision-making and critical analysis.
Primary forms of reasoning
Deductive: drawing a specific conclusion from general facts.
Inductive: making a broad generalization from specific observations.
Abductive: finding the simplest explanation for an observation.
Plan formulation
Decomposing a top-level task into numerous sub-tasks.
Plan reflection
Leveraging a feedback mechanism to reflect upon a plan and evaluate its merits.
Slide 16
Slide 16 text
Chain-of-Thought (CoT)
Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)
Forcing the AI to explain its reasoning.
Without Chain-of-Thought prompting
With Chain-of-Thought prompting
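The difference between the two prompts can be sketched in plain Ruby. This is a minimal illustration, not code from the talk: the `build_prompt` helper and the shirt exemplar are hypothetical, and the prompt layout simply follows the few-shot style described in the cited paper, where the worked example either shows its intermediate steps or only its final answer.

```ruby
# One worked exemplar, stored with and without its intermediate reasoning.
# (The exemplar itself is invented for illustration.)
EXEMPLAR = {
  question: "A shop has 4 boxes with 6 shirts each. It sells 5 shirts. How many are left?",
  reasoning: "4 boxes of 6 shirts is 4 * 6 = 24 shirts. After selling 5, 24 - 5 = 19 remain.",
  answer: "19"
}.freeze

# Builds a few-shot prompt. With chain_of_thought: true the exemplar answer
# spells out its reasoning, nudging the model to do the same for the new question.
def build_prompt(question, chain_of_thought:)
  worked = if chain_of_thought
    "#{EXEMPLAR[:reasoning]} The answer is #{EXEMPLAR[:answer]}."
  else
    "The answer is #{EXEMPLAR[:answer]}."
  end

  "Q: #{EXEMPLAR[:question]}\nA: #{worked}\n\nQ: #{question}\nA:"
end

puts build_prompt("3 shirts cost $20 each. What do 3 cost after $5 off?", chain_of_thought: true)
```

Either string would then be sent to whichever LLM client the application uses.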
Slide 17
Slide 17 text
Business logic
Tasks/Goals/Objectives/Workflows/"Standard Operating Procedures"
Standard Operating Procedures in e-commerce.
New Order
Return Order
Slide 18
Slide 18 text
Triggers
State change
Schedule
Event-driven
Manual
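As a rough sketch of how these trigger kinds might start an agent run, the registry below maps each kind to handler blocks. The `TriggerRegistry` class and its `on`/`fire` methods are hypothetical names chosen for this example; in a Rails app the same roles would more likely be played by callbacks, scheduled jobs, or webhooks.

```ruby
# Hypothetical registry: each trigger kind maps to blocks that start an agent run.
class TriggerRegistry
  KINDS = %i[state_change schedule event manual].freeze

  def initialize
    @handlers = Hash.new { |hash, kind| hash[kind] = [] }
  end

  # Register a block to run when the given kind of trigger fires.
  def on(kind, &handler)
    validate!(kind)
    @handlers[kind] << handler
    self
  end

  # Fire a trigger and return whatever each handler returned.
  def fire(kind, payload = {})
    validate!(kind)
    @handlers[kind].map { |handler| handler.call(payload) }
  end

  private

  def validate!(kind)
    raise ArgumentError, "unknown trigger kind: #{kind}" unless KINDS.include?(kind)
  end
end

registry = TriggerRegistry.new
registry.on(:state_change) { |payload| "agent run for order #{payload[:order_id]} (#{payload[:status]})" }
puts registry.fire(:state_change, { order_id: 42, status: "paid" })
```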
Slide 19
Slide 19 text
Memory
Saving the context, execution progress, tool calling to memory
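A minimal sketch of what such a memory could look like, assuming a simple append-only log that is serialized to JSON. The `AgentMemory` class and its method names are hypothetical; a real agent would likely persist entries to a database rather than a string.

```ruby
require "json"

# Hypothetical append-only memory for context, progress, and tool calls.
class AgentMemory
  attr_reader :entries

  def initialize
    @entries = []
  end

  # kind is a symbol such as :context, :progress, or :tool_call.
  def record(kind, content)
    @entries << { kind: kind, content: content }
    self
  end

  # Serialize so a later run can pick up where this one stopped.
  def dump
    JSON.generate(@entries)
  end

  def self.load(json)
    memory = new
    JSON.parse(json, symbolize_names: true).each do |entry|
      memory.record(entry[:kind].to_sym, entry[:content])
    end
    memory
  end
end

memory = AgentMemory.new
memory.record(:context, "Customer asked about order 42")
memory.record(:tool_call, { name: "get_order_status", arguments: { order_id: 42 } })
restored = AgentMemory.load(memory.dump)
puts restored.entries.size
```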
Slide 20
Slide 20 text
Retrieval Augmented Generation (RAG)
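The retrieve-then-augment flow can be sketched without any external service. This example is an assumption-laden simplification: it ranks documents by shared words, whereas production RAG typically uses embeddings and a vector store, and the documents and helper names (`retrieve`, `augmented_prompt`) are invented for illustration.

```ruby
# Hypothetical knowledge base for the example.
DOCUMENTS = [
  "Orders ship within 2 business days.",
  "Returns are accepted within 30 days of delivery.",
  "T-shirts are 100% cotton."
].freeze

def tokens(text)
  text.downcase.scan(/[a-z0-9]+/)
end

# Retrieve: rank documents by how many distinct words they share with the question.
def retrieve(question, documents, limit: 1)
  query = tokens(question)
  documents.max_by(limit) { |doc| (tokens(doc) & query).size }
end

# Augment: place the retrieved text in the prompt ahead of the question.
def augmented_prompt(question, documents)
  context = retrieve(question, documents).join("\n")
  "Answer using only this context:\n#{context}\n\nQuestion: #{question}"
end

puts augmented_prompt("How many days do I have for returns?", DOCUMENTS)
```

The generation step, sending the augmented prompt to an LLM, is left out because it depends on the client in use.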
Slide 21
Slide 21 text
Tool/Function Calling
Structured Outputs
Responses adhere to a predefined JSON schema
External Tools
Intent detection
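To make the structured-output idea concrete, here is a small sketch that parses a model response and checks it against expected keys and types. The `RESPONSE_SCHEMA` hash is a deliberately simplified stand-in for a real JSON schema, and the field names are hypothetical.

```ruby
require "json"

# Simplified stand-in for a JSON schema: required keys and their Ruby types.
RESPONSE_SCHEMA = { "intent" => String, "order_id" => Integer }.freeze

# Parses the raw model output and rejects responses that do not fit the schema.
def parse_structured(raw, schema)
  data = JSON.parse(raw)
  schema.each do |key, type|
    raise ArgumentError, "missing or invalid #{key}" unless data[key].is_a?(type)
  end
  data
end

response = parse_structured('{"intent": "return_order", "order_id": 42}', RESPONSE_SCHEMA)
puts response["intent"]
```

Once the response is known to fit the schema, the detected intent can be routed to ordinary Ruby code.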
Slide 22
Slide 22 text
Tool Calling
Use tools to do the following:
Get data from external sources (APIs)
Get real-time data
Take actions
Execute deterministic tasks
Without Tools
Using the Tool (Code Interpreter)
Slide 23
Slide 23 text
Tool Calling
Function definition
User's message
Function invocation
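The three pieces on this slide can be sketched end to end in plain Ruby. Everything here is illustrative: the `get_order_status` function, its canned result, and the hard-coded `tool_call` hash (which stands in for what a model might return for the user's message) are assumptions, not the talk's code or any specific provider's response format.

```ruby
require "json"

# Function definition: described to the model in a JSON-schema style.
FUNCTION_DEFINITION = {
  name: "get_order_status",
  description: "Look up the status of an order",
  parameters: {
    type: "object",
    properties: { order_id: { type: "integer" } },
    required: ["order_id"]
  }
}.freeze

# Local implementations, keyed by function name (canned data for the example).
TOOLS = {
  "get_order_status" => ->(order_id:) { { order_id: order_id, status: "shipped" } }
}.freeze

# Function invocation: look up the handler and call it with the parsed arguments.
def invoke(tool_call, tools)
  handler = tools.fetch(tool_call["name"])
  arguments = JSON.parse(tool_call["arguments"], symbolize_names: true)
  handler.call(**arguments)
end

# User's message: "Where is order 42?" The hash below stands in for the model's reply.
tool_call = { "name" => "get_order_status", "arguments" => '{"order_id": 42}' }
puts invoke(tool_call, TOOLS)
```

The result would normally be appended to the conversation so the model can phrase the final answer.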
Slide 24
Slide 24 text
AI Agent diagram
LLMs
AI Agent
Tools
Triggers
Instructions
Memory
User
Store/Retriever
Take Actions
Reason/Plan
Business logic
Converse
Slide 25
Slide 25 text
langchainrb
Ruby framework for building LLM-powered applications
Slide 26
Slide 26 text
Demo
Slide 27
Slide 27 text
Nerds & Threads
Selling comfortable nerdy t-shirts for software engineers that work from home
AI Agent
Customer Management
Email Service
Payment Gateway Service
Order Management
Inventory Management
Shipping Service
Slide 28
Slide 28 text
Business logic (in code)
The Ruby on Rails promise:
"Developers focus on writing business logic and not the 'plumbing'"
Old World (before AI)
Business logic in models and service objects
New World (after AI)
Business logic in prompts
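A small side-by-side may help picture the shift. Both halves are hypothetical: the `RefundPolicy` class, the 30-day window, and the wording of the instructions are invented for illustration and are not taken from the demo.

```ruby
# Old world: the rule lives in a service object.
class RefundPolicy
  WINDOW_DAYS = 30

  def self.refundable?(days_since_delivery)
    days_since_delivery <= WINDOW_DAYS
  end
end

# New world: the same rule is written as instructions for the agent,
# which decides when to call its tools.
RETURN_ORDER_INSTRUCTIONS = <<~PROMPT
  You handle return requests for an online t-shirt shop.
  1. Look up the order.
  2. If it was delivered more than #{RefundPolicy::WINDOW_DAYS} days ago, decline politely.
  3. Otherwise, issue a refund and email the customer.
PROMPT

puts RETURN_ORDER_INSTRUCTIONS
```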
Slide 29
Slide 29 text
Code
Slide 30
Slide 30 text
Tool Definitions
Slide 31
Slide 31 text
Demo
Slide 32
Slide 32 text
Text-to-SQL
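One way to approach text-to-SQL is to hand the model the schema alongside the question, then check the generated statement before running it. The sketch below assumes a made-up schema and helper names; the `read_only?` check is a basic illustration and not a complete safeguard against unsafe SQL.

```ruby
# Hypothetical schema for the demo shop; table and column names are invented.
DB_SCHEMA = <<~SQL
  CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT, total_cents INTEGER);
  CREATE TABLE customers (id INTEGER, email TEXT);
SQL

# Builds the prompt that asks the model to translate a question into SQL.
def text_to_sql_prompt(schema, question)
  <<~PROMPT
    Given this database schema:
    #{schema}
    Write one SQL SELECT statement that answers: #{question}
    Return only the SQL.
  PROMPT
end

# Accepts a single SELECT statement and rejects everything else.
def read_only?(sql)
  statement = sql.strip.chomp(";")
  statement.match?(/\Aselect\b/i) && !statement.include?(";")
end

puts text_to_sql_prompt(DB_SCHEMA, "How many orders are shipped?")
puts read_only?("SELECT COUNT(*) FROM orders WHERE status = 'shipped';")
```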
Slide 33
Slide 33 text
Why would you use this?
Changing requirements on the fly
Intelligence in your process
Tackling complex workflows
Slide 34
Slide 34 text
Evaluations
Benchmarks
Comparing to a large dataset of question-answer pairs.
"LLM as a Judge"
Asking an LLM whether the answer fits a list of criteria.
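Both evaluation styles can be sketched as small functions that take the system under test as a callable. The names `benchmark_accuracy` and `judged_pass?` are hypothetical, and the lambdas in the usage lines are stubs standing in for real LLM calls.

```ruby
# Benchmark: fraction of question-answer pairs the system answers exactly.
def benchmark_accuracy(pairs, answerer)
  correct = pairs.count { |pair| answerer.call(pair[:question]).strip == pair[:answer] }
  correct.to_f / pairs.size
end

# "LLM as a Judge": the judge is asked about each criterion in turn.
def judged_pass?(answer, criteria, judge)
  criteria.all? { |criterion| judge.call(answer, criterion) }
end

# Stubs standing in for model calls.
pairs = [{ question: "2 + 2", answer: "4" }, { question: "3 + 3", answer: "6" }]
always_four = ->(_question) { "4" }
lenient_judge = ->(answer, _criterion) { !answer.empty? }

puts benchmark_accuracy(pairs, always_four)
puts judged_pass?("Your order has shipped.", ["polite", "mentions order status"], lenient_judge)
```

Exact string matching is the simplest possible scoring rule; real benchmarks often normalize answers before comparing them.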
Slide 35
Slide 35 text
Benchmarks
huggingface
gretelai/gsm8k-synthetic-diverse-405b · Datasets at Hugging Face
Slide 36
Slide 36 text
Agent Reliability
Responsibilities, # of Tasks, Decision Tree: increase from simpler to complex
Reliability: decreases from reliable to unreliable
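One simplified way to see the trend is to assume every step in a task succeeds independently with the same probability. That assumption is added here for illustration and is not a claim from the talk; real agents fail in correlated ways. Under it, the chance that a whole chain of steps succeeds shrinks as the chain grows.

```ruby
# Assumes independent steps with identical per-step success probability.
def chain_reliability(per_step, steps)
  per_step**steps
end

[1, 5, 10, 20].each do |steps|
  puts "#{steps} steps: #{(chain_reliability(0.95, steps) * 100).round(1)}%"
end
```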
Slide 37
Slide 37 text
System reliability
Modern software fails because:
Dependencies
Doesn't scale
Cloud outages
Cyber attacks
Insufficient testing (bugs)
AI systems fail because:
Inaccurate or incomplete data / Bias in data
Compute limits
Cloud outages
Adversarial attacks
Black box behavior
Unclear liability & accountability
Engineering problems that will be solved.