with: Apple, Slab, TheScore, Superlist, etc. ✦ 16+ years of polyglot experience, focus on Web & Cloud ✦ Stack Overflow: 75,000+ score (Top 5 in Pakistan) ✦ Author / Contributor of multiple famous libraries & tools ✦ Featured on popular developer communities
by a chatbot ✦ Struggled a lot with AI-assisted coding ✦ Code quality was extremely poor ✦ Often had to spend time fixing it ✦ Or throw it away and do it manually
developers were 19% slower with early-2025 AI. ✦ DORA: Google's DORA 2024 research found that AI adoption reduced delivery stability, a trend that continued into 2025 despite higher adoption & throughput.
a Google replacement ✦ Vibe Coding: Fully delegating code to AI without reviewing output ✦ Vibe Engineering: Accelerating professional software engineering with AI YOU ARE HERE
+ Harness ✦ "Everything other than the model" ✦ Prompt, Evals, Tool Calls, Docs, Context, etc. ✦ Even the GUI/CLI "agent" tool you use What's a Harness? “Agent = Model + Harness” Vivek Trivedi (Researcher, LangChain)
→ 80%, 23% → 45% ✦ ~22 point swings vs ~1 point swings ✦ Using frontier models The Model Doesn't Matter [Figure: same harness, different model: ~1 point swings; same model, scaffold changes: ~22 point swings]
(CLI/GUI tool) ✦ System prompt, Tool calls, Orchestration ✦ Outer Harness (User) ✦ Controls put in place by users ✦ User prompt, Agent rules, Output validation ✦ Our focus today Anatomy of a Harness [Diagram labels: Model, Inner Harness, Outer Harness]
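The outer-harness controls listed on this slide (user prompt, agent rules, output validation) can be sketched as a small wrapper around a model call. This is a minimal illustration under stated assumptions, not any particular tool's API: `run_with_outer_harness`, the `model` callable, and the retry-with-feedback loop are all hypothetical names and choices made for the example.

```python
from typing import Callable, List, Optional

# Hypothetical types for illustration: a model is "prompt in, text out",
# and a validator returns an error message on failure or None on success.
Model = Callable[[str], str]
Validator = Callable[[str], Optional[str]]


def run_with_outer_harness(
    model: Model,
    task: str,
    rules: List[str],
    validate: Validator,
    max_attempts: int = 3,
) -> str:
    """Combine agent rules and the user prompt, then validate every output."""
    prompt = "\n".join(["Rules:", *rules, "", "Task:", task])
    for _ in range(max_attempts):
        output = model(prompt)
        error = validate(output)
        if error is None:
            return output
        # Feed the validation failure back so the next attempt can correct it.
        prompt += f"\n\nPrevious attempt was rejected: {error}"
    raise RuntimeError(f"no valid output after {max_attempts} attempts")
```

The point of the sketch is that none of this code touches the model itself: the rules, the validator, and the retry policy are all controls the user owns.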
quality ✦ Focus on deterministic controls first ✦ Fast, reliable, cheap ✦ Implementation Layers ✦ Fastest & most accurate feedback early ✦ Goal: Push agents' reliable coverage as far up as possible Implementing Sensors IMPLEMENTATION LAYERS: 1. LINTING & STATIC CHECKS 2. UNIT TESTS 3. INTEGRATION/E2E 4. AI REVIEWS 5. MANUAL QA
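A rough sketch of the "deterministic first" layering: sensors run in order from cheapest to most expensive, and the first failing layer's feedback is returned immediately. Everything here is an assumption made for illustration: the `Sensor` type, the two stand-in checks (a syntax parse standing in for linting, a single assertion standing in for a unit-test suite), and the `add` function under test are hypothetical.

```python
import ast
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Sensor:
    name: str
    check: Callable[[str], Optional[str]]  # feedback on failure, None on pass


def syntax_check(source: str) -> Optional[str]:
    """Layer 1 stand-in: a static check that the code parses at all."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}: {exc.msg}"
    return None


def unit_test_check(source: str) -> Optional[str]:
    """Layer 2 stand-in: execute the code and assert on a known behaviour."""
    namespace: dict = {}
    exec(source, namespace)  # illustration only; sandbox untrusted code in practice
    add = namespace.get("add")
    if add is None:
        return "expected a function named add"
    if add(2, 3) != 5:
        return "add(2, 3) should return 5"
    return None


# Ordered cheapest first, mirroring the slide's layers 1 and 2.
SENSORS = [Sensor("lint/static", syntax_check), Sensor("unit tests", unit_test_check)]


def first_failure(source: str, sensors=SENSORS) -> Optional[str]:
    """Run cheap layers first; return the earliest failing layer's feedback."""
    for sensor in sensors:
        feedback = sensor.check(source)
        if feedback is not None:
            return f"[{sensor.name}] {feedback}"
    return None
```

Stopping at the first failure keeps feedback fast and cheap: an agent whose code does not parse never reaches the slower layers.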
patterns ✦ Keep AI out of writing tests, preserve double-bookkeeping ✦ Build Reusable Harnesses ✦ CI templates with common deterministic checks ✦ Inferential review agents for security, architecture, gap analysis, even PR reviews ✦ Scale via Service Templates ✦ Service-level AGENTS.md Recommendations
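One way the "keep AI out of writing tests" recommendation could be enforced deterministically is a CI check that flags test files in an agent-authored change. This is only a sketch: the function name `touched_test_files` and the path conventions it matches (`tests/` directories, `test_*` and `*_test.py` files) are assumptions, not something the talk specifies.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def touched_test_files(changed_paths: Iterable[str]) -> List[str]:
    """Return the changed paths that look like test files (assumed conventions)."""
    flagged = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        in_tests_dir = "tests" in path.parts
        looks_like_test = path.name.startswith("test_") or path.name.endswith("_test.py")
        if in_tests_dir or looks_like_test:
            flagged.append(raw)
    return flagged
```

A CI template could fail the build when this list is non-empty for agent-authored commits, so human-written tests stay an independent check on AI-written code.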
✦ Internal team guides ✦ Codemods & internal tools ✦ Boilerplate projects ✦ Embed harnesses directly in them ✦ Scaffold not just code, but AI knowledge and conventions from day one ✦ Inter-organization review agents Service Templates
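Scaffolding "AI knowledge and conventions from day one" might look like a service template that writes an AGENTS.md next to the code it generates. The sketch below assumes a hypothetical `scaffold_service` helper and placeholder AGENTS.md content; a real template would carry the team's own rules and checks.

```python
from pathlib import Path

# Placeholder content; a real template would embed the team's actual conventions.
AGENTS_MD = """# AGENTS.md (hypothetical template)
- Run the linter and unit tests before proposing a change.
- Do not edit files under tests/.
"""


def scaffold_service(root: Path, name: str) -> Path:
    """Create a new service directory with source layout and agent rules."""
    service_dir = root / name
    # exist_ok=False so an existing service is never silently overwritten.
    (service_dir / "src").mkdir(parents=True, exist_ok=False)
    (service_dir / "AGENTS.md").write_text(AGENTS_MD, encoding="utf-8")
    return service_dir
```

Because the harness ships with the template, every new service starts with the same agent rules instead of acquiring them ad hoc later.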