Slide 1

Slide 1 text

Code District, Lahore Disciplined Vibes: Scaling AI-Assisted Engineering Sheharyar Naseer ◦ June 2026

Slide 2

Slide 2 text

Sheharyar Naseer Systems architect & technology advisor for startups and enterprises. Find me online @sheharyarn

Slide 3

Slide 3 text

Background ✦ Principal Software Architect at Infra One ✦ Worked with: Apple, Slab, TheScore, Superlist, etc. ✦ 16+ years of polyglot experience, focus on Web & Cloud ✦ Stack Overflow: 75,000+ score (Top 5 in Pakistan) ✦ Author / Contributor of multiple famous libraries & tools ✦ Featured on popular developer communities

Slide 4

Slide 4 text

Outline PART 1 The Problem PART 2 It's Not the Model PART 3 Harness Engineering PART 4 Live Workshop OUTRO What's Next?

Slide 5

Slide 5 text

01 The Problem Why you're struggling with AI-assisted coding, and the data to back it up.

Slide 6

Slide 6 text

Struggling with AI ✦ 16+ years of experience, still humbled by a chatbot ✦ Struggled a lot with AI-assisted coding ✦ Code quality was extremely poor ✦ Often had to spend time fixing it ✦ Or throwing it away and doing it manually

Slide 7

Slide 7 text

AI-Assisted Problems ✦ Hallucinated APIs, function calls, and packages ✦ Insecure code ✦ Architectural drift ✦ Ignored edge cases ✦ Incorrect or missing error handling ✦ Performance issues ✦ So many more...

Slide 8

Slide 8 text

The Data Agrees
✦ METR: METR's randomized controlled trial found experienced developers were 19% slower with early-2025 AI tools.
✦ DORA: Google's DORA 2024 research found AI adoption reduced delivery stability, a trend that continued into 2025 despite higher adoption and throughput.

Slide 9

Slide 9 text

“Seniors often get worse results than juniors from same tools until they learn deliberate prompting. But once they do they have a massive advantage.” Sabrina Goldfarb, SWE at GitHub Copilot

Slide 10

Slide 10 text

02 It's Not The Model Exploring the root causes and developing the right thinking model.

Slide 11

Slide 11 text

It's a You Problem ✦ Don't understand how LLMs work ✦ Goldfish memory & context management ✦ Incomplete specs ✦ Basic prompts ✦ Missing documentation & examples ✦ Unreliable guardrails ✦ No systems or quality checks ✦ Agents don't receive feedback about what's wrong

Slide 12

Slide 12 text

Mental Models ✦ AI Search: Shallow use of modern LLMs as a Google replacement ✦ Vibe Coding: Fully delegating code to AI without reviewing output ✦ Vibe Engineering: Accelerating professional software engineering with AI YOU ARE HERE

Slide 13

Slide 13 text

Vibe Engineering ✦ Does not mean better prompts ✦ Foundation/architecture/system where the agent can "succeed" ✦ Feedback loops ✦ Also called Evaluation Driven Development (EDD)
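
The feedback-loop idea can be sketched as a small driver: the agent produces code, sensors check it, and any failures are folded into the next prompt. This is a minimal sketch under stated assumptions, not the speaker's implementation. `Agent`, `Sensor`, `SensorResult`, and `run_loop` are hypothetical names, and the agent is modeled as a plain function from prompt to code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SensorResult:
    name: str
    passed: bool
    feedback: str = ""


# Hypothetical interfaces: an agent maps a prompt to code,
# a sensor inspects code and reports what is wrong with it.
Agent = Callable[[str], str]
Sensor = Callable[[str], SensorResult]


def run_loop(agent: Agent, sensors: List[Sensor], task: str,
             max_rounds: int = 3) -> Tuple[str, bool]:
    """Prompt the agent, run every sensor, and feed failures back in."""
    prompt = task
    code = ""
    for _ in range(max_rounds):
        code = agent(prompt)
        failures = [r for r in (sensor(code) for sensor in sensors) if not r.passed]
        if not failures:
            return code, True
        notes = "\n".join(f"- {r.name}: {r.feedback}" for r in failures)
        # The loop, not a human, writes the follow-up prompt.
        prompt = f"{task}\n\nPrevious attempt:\n{code}\n\nFix these problems:\n{notes}"
    return code, False
```

The round limit is a design choice: a loop that cannot converge should hand control back to a human rather than keep spending tokens.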

Slide 14

Slide 14 text

“You shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.” Peter Steinberger, Creator of OpenClaw, Technical Staff at OpenAI

Slide 15

Slide 15 text

03 Harness Engineering The scaffold is the product.

Slide 16

Slide 16 text

What's a Harness?
✦ LangChain's research team describes it as: Agent = Model + Harness
✦ "Everything other than the model"
✦ Prompt, Evals, Tool Calls, Docs, Context, etc.
✦ Even the GUI/CLI "agent" tool you use
“Agent = Model + Harness” Vivek Trivedi (Researcher, LangChain)

Slide 17

Slide 17 text

The Model Doesn't Matter
✦ SWE-bench score improvements
✦ 42% → 78%, 46% → 80%, 23% → 45%
✦ ~22 point swings vs ~1 point swings
✦ Using frontier models
SAME HARNESS, different model: ~1 point swings
SAME MODEL, scaffold changes: ~22 point swings

Slide 18

Slide 18 text

Anatomy of a Harness
✦ Inner Harness (System)
  ✦ Built into your coding agent (CLI/GUI tool)
  ✦ System prompt, Tool calls, Orchestration
✦ Outer Harness (User)
  ✦ Controls put in place by users
  ✦ User prompt, Agent rules, Output validation
  ✦ Our focus today
Diagram layers: MODEL, INNER HARNESS, OUTER HARNESS

Slide 19

Slide 19 text

Types of Harness
DIRECTION: Feedforward Guides, Feedback Sensors
NATURE: Inferential, Deterministic
DOMAIN: Behaviour, Maintainability, Architecture

Slide 24

Slide 24 text

HUMAN ✦ AGENT
Feedforward Guides (INITIAL GENERATION): Prompts, AGENTS.md, Specs, PRD & ADR, Styleguides, Reference docs, Rules, Scripts / CLI tools, Codemods, Language servers, ...
Feedback Sensors (SELF-CORRECTING): Unit tests, E2E tests, Static analysis, Review agents, Logs, Browser, Linters, SBOM validation, Security scanners, ...

Slide 25

Slide 25 text

Implementing Guides
✦ Write actual documentation
  ✦ Guides, rules, conventions; plus examples
  ✦ Current architecture overview
  ✦ Long-term specs, PRDs, and ADRs
✦ Add helpful tooling
  ✦ Code generation scripts, tools, helpers
  ✦ Language servers
✦ Entrypoint is the "router"

my_app
├── AGENTS.md
├── docs/
│   ├── rules/
│   ├── guides/
│   ├── adrs/
│   └── specs/
├── . . .
└── . . .
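
One way to read "Entrypoint is the router" is that AGENTS.md stays short and only points the agent at the right folder under docs/. The sketch below generates such an index from the layout shown on this slide. It is an illustration under assumptions: `build_router`, the `SECTIONS` descriptions, and the wording of the generated file are hypothetical, not taken from the talk.

```python
from pathlib import Path

# Folder names come from the slide; the one-line summaries are assumptions.
SECTIONS = {
    "rules": "Hard constraints that always apply",
    "guides": "Conventions and how-to guides, with examples",
    "adrs": "Architecture decision records",
    "specs": "Long-term specs and PRDs",
}


def build_router(project_root: Path) -> str:
    """Return AGENTS.md text that routes the agent to documents under docs/."""
    lines = [
        "# AGENTS.md",
        "",
        "Read only the documents relevant to the current task.",
        "",
    ]
    for section, summary in SECTIONS.items():
        folder = project_root / "docs" / section
        if not folder.is_dir():
            continue  # skip sections this project does not have yet
        lines.append(f"## docs/{section}/ : {summary}")
        for doc in sorted(folder.glob("*.md")):
            lines.append(f"- docs/{section}/{doc.name}")
        lines.append("")
    return "\n".join(lines)
```

Regenerating the file from the folder contents keeps the router from drifting out of date as documents are added.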

Slide 26

Slide 26 text

Implementing Sensors
✦ More important than Guides
  ✦ For maintainability and architectural quality
✦ Focus on Deterministic controls first
  ✦ Fast, reliable, cheap
✦ Implementation Layers
  ✦ Fastest & most accurate feedback early
  ✦ Goal: Push agents' reliable coverage as far up as possible
IMPLEMENTATION LAYERS: 1. Linting & static checks 2. Unit tests 3. Integration / E2E 4. AI reviews 5. Manual QA
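
The layering above can be sketched as a runner that executes the cheapest deterministic checks first and stops at the first failure, so the agent gets fast, specific feedback. This is a sketch under assumptions: `run_layers` and `DEFAULT_LAYERS` are hypothetical names, and ruff and pytest are example tool choices rather than tools named in the talk.

```python
import subprocess
import sys
from typing import List, Optional, Sequence, Tuple

Layer = Tuple[str, Sequence[str]]

# Example layers, cheapest first. The tools are assumptions; swap in your own.
DEFAULT_LAYERS: List[Layer] = [
    ("lint", ["ruff", "check", "."]),
    ("unit tests", [sys.executable, "-m", "pytest", "-q"]),
]


def run_layers(layers: List[Layer]) -> Optional[str]:
    """Run each layer in order and return feedback for the first failure.

    Returns None when every layer passes.
    """
    for name, command in layers:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            # Later, slower layers are skipped once an earlier one fails.
            return f"[{name}] failed:\n{result.stdout}{result.stderr}".strip()
    return None
```

The returned string is meant to be handed straight back to the agent as the next prompt's feedback; AI reviews and manual QA would sit after these layers and are not modeled here.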

Slide 27

Slide 27 text

04 Live Workshop It's your turn. Let's build our own outer harness.

Slide 28

Slide 28 text

Pick an Idea Let's build the App and a Harness

Slide 29

Slide 29 text

05 What's Next? Improving harnesses and building repeatable systems.

Slide 30

Slide 30 text

Recommendations
✦ Establish Discipline
  ✦ Capture standard conventions, security mandates, architecture patterns
  ✦ Keep AI out of writing tests, preserve double-bookkeeping
✦ Build Reusable Harnesses
  ✦ CI templates with common deterministic checks
  ✦ Inferential review agents for security, architecture, gap analysis, even PR reviews
✦ Scale via Service Templates
  ✦ Service-level AGENTS.md
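
One possible way to enforce "keep AI out of writing tests" is a deterministic check that flags any agent-authored change touching a test directory. The sketch below assumes tests live in directories named `tests` or `spec`; `PROTECTED_DIRS` and `protected_changes` are hypothetical names, and how you identify agent-authored changes is left to your own workflow.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Assumed convention: human-owned tests live in directories with these names.
PROTECTED_DIRS = ("tests", "spec")


def protected_changes(changed_paths: Iterable[str]) -> List[str]:
    """Return the changed paths that sit inside a protected test directory."""
    hits = []
    for raw in changed_paths:
        # Only directory components count; a file merely named "tests.md" does not.
        directories = PurePosixPath(raw).parts[:-1]
        if any(part in PROTECTED_DIRS for part in directories):
            hits.append(raw)
    return hits
```

In CI, the path list could come from `git diff --name-only` against the base branch, with the job failing whenever the result is non-empty for an agent-authored branch.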

Slide 31

Slide 31 text

Service Templates
✦ Enterprises & agencies have predefined service templates
  ✦ Internal team guides
  ✦ Codemods & internal tools
  ✦ Boilerplate projects
✦ Embed harnesses directly in them
  ✦ Scaffold not just code, but AI knowledge and conventions from day one
  ✦ Inter-organization review agents
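
Embedding the harness in a service template can be as simple as copying AGENTS.md and docs/ along with the boilerplate code when a new service is created. The following is a minimal sketch, assuming a template directory that contains an AGENTS.md with a `{{service_name}}` placeholder; both the placeholder syntax and the `scaffold_service` function are hypothetical.

```python
import shutil
from pathlib import Path


def scaffold_service(template: Path, target: Path, service_name: str) -> Path:
    """Copy a service template, harness files included, into a new directory."""
    # copytree raises FileExistsError if target exists, so nothing is overwritten.
    shutil.copytree(template, target)
    agents_file = target / "AGENTS.md"
    if agents_file.exists():
        text = agents_file.read_text(encoding="utf-8")
        # Assumed placeholder convention for the service-level AGENTS.md.
        agents_file.write_text(
            text.replace("{{service_name}}", service_name), encoding="utf-8"
        )
    return target
```

Because the rules, guides, and checks travel with the template, every new service starts with the same conventions the agent is expected to follow.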

Slide 32

Slide 32 text

Advanced Workflows
✦ Custom skills and slash commands
✦ Subagents for sub-tasks for context optimization
✦ Agent Councils & Consensus
  ✦ Adversarial reviews with multiple agents deciding on next steps
✦ Parallel agent execution with git worktrees
  ✦ Multiply output using same harness
✦ Independently running agent loops
  ✦ Spec → Code → PR → Review → Address Feedback → Merge
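
Parallel execution with git worktrees gives each agent its own checkout and branch of the same repository, so they all share one harness without stepping on each other's files. The sketch below builds and runs `git worktree add` for a list of tasks. The `agent/<task>` branch naming, the sibling-directory layout, and the function names are assumptions for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_command(repo: Path, task: str) -> List[str]:
    """Build the git command that creates a worktree and branch for one task."""
    branch = f"agent/{task}"                     # assumed branch naming scheme
    path = repo.parent / f"{repo.name}-{task}"   # sibling directory per task
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def create_worktrees(repo: Path, tasks: List[str]) -> List[Path]:
    """Create one worktree per task and return the new checkout paths."""
    paths = []
    for task in tasks:
        command = worktree_command(repo, task)
        subprocess.run(command, check=True)  # raises if git reports an error
        paths.append(Path(command[-1]))
    return paths
```

Each returned path can then host its own agent loop, and `git worktree remove` cleans a checkout up once its branch has been merged.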

Slide 33

Slide 33 text

Questions?
Further Reading → martinfowler.com/articles/harness-engineering.html
These Slides → shyr.io/t/disciplined-vibes
More Talks → shyr.io/talks
shyr.io @sheharyarn