


Rescue Agents from Prototype Purgatory: Operationalize Agent Readiness

2025 will be remembered as the year enterprises experimented with AI agents. By 2026, experimentation will no longer be enough. As agents begin influencing real workflows, decisions, and outcomes, organizations face a new challenge: defining when an agent is actually “ready” to operate. This session explores how Agent Readiness pairs qualitative standards with enforceable control points, enabling teams to rescue agents from infinite prototype loops while building and maintaining trust, consistency, and accountability. We'll explore how readiness reframes the software development lifecycle, and how changing how you work with AI is as important as what you decide to do with AI.


Ignasi Barrera

March 23, 2026


Transcript

  1. Agent Demos Not Translating To Outcomes

     How AI projects stall:
     • Business trust in AI falls
     • AI mistakes & misbehavior
     • Demo success declared prematurely
     • Escalating human correction

     Businesses want proof:
     • Clear standards for agent readiness
     • Predictable behavior
     • Confidence agents are ready

     The human work to supervise agents can't be more than the work offloaded. We need automated oversight for agents.

     Agents run in production from day one, but few become "complete." Every team builds differently, no shared standard exists, and nothing defines when an agent is actually done. Without a finish line, AI stays in prototype/pilot mode and value stalls.
  2. The 3 Guarantees Of Automated Oversight

     Your agents don't break when models change or fail. AI introduces critical external dependencies; oversight must absorb that volatility without breaking operations.

     Your data stays where you intend. AI creates new paths for data exposure that require enforcement at runtime, not just policy at design time.

     Your agents behave within approved bounds. Exact outputs can't be guaranteed, but acceptable behavior can be defined, measured, and enforced over time.
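The third guarantee, that acceptable behavior can be defined, measured, and enforced over time, can be sketched as a tiny runtime check. This is a hypothetical illustration, not the talk's implementation; the `BehaviorPolicy` class, its fields, and the blocked terms are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class BehaviorPolicy:
    """Approved bounds for an agent's output (illustrative only)."""
    max_chars: int = 2000
    blocked_terms: tuple = ("internal-only", "api_key")
    violation_count: int = 0  # measured over time, not just blocked once

    def check(self, output: str) -> bool:
        """Return True if the output stays within the approved bounds."""
        ok = len(output) <= self.max_chars and not any(
            term in output.lower() for term in self.blocked_terms
        )
        if not ok:
            # Counting violations lets oversight trend behavior over time.
            self.violation_count += 1
        return ok

policy = BehaviorPolicy()
print(policy.check("Here is the summary you asked for."))  # True
print(policy.check("the api_key is sk-123"))               # False
print(policy.violation_count)                              # 1
```

The point is not the specific rules but the shape: bounds are explicit, every request is checked, and violations become a metric rather than a one-off block.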
  3. Engineers Need A Control Point Within The Network To Apply Runtime Governance

     What needs to be managed: AGENT → Application Logic → LLMs → MCP Tools

     Governance at Day 0: runtime enforcement is the prerequisite; without it, there is nothing to govern, and agents cannot launch safely.

     Continuous vs. periodic: GRC tools check compliance once daily at best. Runtime enforcement is truly continuous: per-request, at scale.

     GRC integration: a simple REST/HTTP API connects the runtime layer to GRC tools for compliance status reporting. Evidence flows up automatically.

     ENVOY owns the Runtime Enforcement Layer, the critical control point the entire stack depends on.
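The "evidence flows up automatically" idea can be sketched as the runtime layer aggregating per-request enforcement decisions into a compliance-status payload that a plain REST/HTTP POST would deliver to a GRC tool. The endpoint, field names, and schema below are assumptions for illustration, not a documented API.

```python
import json
from datetime import datetime, timezone

def build_compliance_report(agent_id: str, decisions: list[dict]) -> str:
    """Aggregate runtime enforcement decisions into a JSON report body.

    `decisions` is a list of per-request records like {"allowed": bool}.
    """
    allowed = sum(1 for d in decisions if d["allowed"])
    report = {
        "agent_id": agent_id,
        "reported_at": datetime.now(timezone.utc).isoformat(),
        "requests_total": len(decisions),
        "requests_allowed": allowed,
        "requests_blocked": len(decisions) - allowed,
        "status": "compliant" if allowed == len(decisions) else "violations",
    }
    return json.dumps(report)

body = build_compliance_report(
    "support-agent", [{"allowed": True}, {"allowed": False}]
)
# An HTTP client would POST `body` to the GRC tool's intake endpoint.
```

Because the report is derived from every request the control point sees, the compliance picture is continuous rather than a once-daily snapshot.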
  4. Envoy Already Runs The World's Application Traffic

     Top 20 companies running Envoy at scale:
     • Lyft traffic: millions of RPS
     • Pinterest: 250M+ MAU
     • Bitbucket: billions of RPD
     • Instance capacity: 2.3B RPD

     Confirmed production users: Lyft (edge + mesh), Pinterest (edge), Bitbucket (edge), Google (mesh), Netflix (mesh), Airbnb (mesh), Uber (mesh), Apple (mesh), Microsoft (mesh), Amazon (App Mesh), Booking.com (mesh), eBay (mesh), Salesforce (mesh), Stripe (mesh), Square (mesh), Twilio (mesh), Verizon (mesh), Tencent (mesh), IBM (mesh), Medium (mesh)

     RPS = requests per second | RPD = requests per day | MAU = monthly active users. Note: most companies treat traffic volumes as confidential.
  5. What Technical Readiness Actually Looks Like

     | Metric         | Pilot Standard           | Production Standard      | How to Get There                              |
     |----------------|--------------------------|--------------------------|-----------------------------------------------|
     | Accuracy       | "Pretty good"            | <2% hallucination rate   | LLM-as-a-Judge + guardrail validation         |
     | Explainability | "We can check logs"      | Complete audit trail     | Gateway + structured logging                  |
     | Availability   | "Works most of the time" | 99.9% uptime             | Multi-model routing with failover             |
     | Security       | "We're being careful"    | Pass penetration testing | AI Firewall (inbound + outbound) & guardrails |
     | Cost           | "We'll see..."           | ±10% of budget           | Rate limiting + usage monitoring + quotas     |
     | Compliance     | "We think it's okay"     | Pass regulatory review   | FINOS framework implementation                |
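The Availability row's "multi-model routing with failover" can be sketched as trying providers in priority order and falling back when one fails. This is a minimal illustration under assumed names; a real router would match specific error types and track provider health.

```python
def route_with_failover(prompt, providers):
    """Try (name, callable) providers in order; return the first success."""
    errors = {}
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real routers catch specific failure types
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")

# Hypothetical providers standing in for real model backends.
def flaky_primary(prompt):
    raise TimeoutError("primary model unavailable")

def backup(prompt):
    return f"answer to: {prompt}"

used, answer = route_with_failover("summarize this doc", [
    ("primary", flaky_primary),
    ("backup", backup),
])
print(used)  # backup
```

Routing this way turns a single model's outage into a degraded-but-available response, which is what makes an uptime target like 99.9% plausible.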