2026 is the Year of the Harness: Harness Engineering Strategies to Save AI Coding Agents from Infinite Loops

AI coding agents can perform well on short tasks, but in real development work, the longer a task becomes, the more likely they are to lose context or repeat the same fixes. This session treats these issues not simply as limitations of model performance, but looks at them from four angles: the environment in which the model works, state management, validation loops, and tool design.

The key idea is the “harness” around the AI model. We introduce how to design structures that keep AI models working without getting lost by combining four elements: mechanisms for recording progress and decisions, documentation structures that allow step-by-step access to necessary knowledge, focused execution environments that avoid excessive tool use, and automated checks that preserve architectural boundaries.

Attendees will gain perspectives on moving AI coding beyond individual trial and error and integrating it reliably into the team development process, along with practical harness design patterns they can apply in real-world work.
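
As one illustration of the "automated checks that preserve architectural boundaries" mentioned above, here is a minimal Python sketch. It is not taken from the talk: the layer names (`domain`, `service`, `api`), the `ALLOWED_IMPORTS` table, and both function names are hypothetical, and the check only inspects absolute imports in a single source string.

```python
import ast

# Hypothetical layering rule (an assumption, not from the talk):
# each layer lists the other layers it is allowed to import from.
ALLOWED_IMPORTS = {
    "domain": set(),
    "service": {"domain"},
    "api": {"service", "domain"},
}


def find_boundary_violations(layer: str, source: str) -> list[str]:
    """Return the imports in `source` that `layer` is not allowed to use."""
    allowed = ALLOWED_IMPORTS[layer]
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            top = module.split(".")[0]
            # Only imports of other known layers are policed; standard
            # library and third-party packages pass through.
            if top in ALLOWED_IMPORTS and top != layer and top not in allowed:
                violations.append(module)
    return violations


def check(layer: str, source: str) -> int:
    """Exit-code style result: 0 lets the change through, 1 blocks it."""
    violations = find_boundary_violations(layer, source)
    for module in violations:
        print(f"DENY: layer '{layer}' must not import '{module}'")
    return 1 if violations else 0
```

Returning a non-zero result rather than printing a warning is the point of the sketch: a harness can wire `check` into a pre-commit step or a tool-call hook so that a boundary violation stops the agent instead of merely advising it.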

Transcript

  1. 2026 is the Year of the Harness: A Harness Engineering Strategy to Save AI Coding Agents from Infinite Loops

    2026.06.29 · LY Corporation · Jeongsu An / Cheehoon Lee · LINE Plus AI Hub Dev
  2. Speaker introduction

    Cheehoon Lee, LINE Plus / AI Hub Dev
    - Applies AI across various industries and business domains
    - Focuses on solving domain-specific problems with practical AI solutions

    Jeongsu An, LINE Plus / AI Hub Dev
    - Has worked on AI Gateway, RAG systems, AI Agent Platform, and Slack bots
    - Currently develops AI Friends and AI-powered service experiences
  3. Harness = a structured work path

    - Before: “Fix the failing tests”
    - After: “Fix the failing tests, follow this work path”
  4. Common Ground: Agent = Model + Harness

    - MODEL: reason · generate · choose
    - HARNESS: context · tools · checks · records

    https://www.langchain.com/blog/the-anatomy-of-an-agent-harness
  5. The model selects; the harness shapes the choice

    Plugin = packaged harness
    - Plugin: install, share, reuse
    - Harness: task rule, PASS, context, tools, workflow, guardrails
  6. A harness is a loop, not a one-shot answer

    Core shape: closed loop
    - “done” = evidence
  7. Where most agent harnesses fall short

    Four common gaps
    - #1 STATIC SCAFFOLDS: rules are advisory; nothing is enforced at runtime
    - #2 SOFT GATES: hooks only print warnings, never block tool calls
    - #3 SESSION DRIFT: memory and progress vanish when sessions restart
    - #4 NO SELF-IMPROVE: spec and skills cannot evolve from history
  8. A sandbox runtime tightly coupled to spec & code, driven by the agent

    Layer 3 · What is a runtime harness?
  9. Every lane run lands in INDEX.jsonl; two tools turn that ledger into trend signals

    History Ledger & Regression Watch
  10. When a spec fails, improve the spec

    Layer 5 · Self-Evolving Pipeline
    *ACK = Acknowledgment
  11. Recap · gap ↔ layer mapping

    gap → our answer
    - Static Scaffolds → L1 · Spec & Harness Contract
    - Soft Gates → L3 · Runtime Watch, L4 · Hard Gates (deny)
    - Session Drift → Session Continuity
    - No Self-Improve → L5 · Self-Evolving
  12. spec, code, and harness re-aligned at every commit

    Spec drift, eliminated
    - Without our plugin: spec.md, code, _harnesses/ drift; spec, code, and harness slowly fall out of sync
    - With validate-spec-contract: spec.md, code, _harnesses/ go through a pre-commit gate that checks every commit; every commit re-aligns the three
  13. Four messages to take home

    Key Takeaways
    - 01 Runtime watch beats static scaffolds: every tool call goes through a hook
    - 02 Gates must be hooks, not advice: permissionDecision='deny' or it's a suggestion
    - 03 Self-evolve with a human ACK gate: autonomous spec patching is dangerous
    - 04 One contract, many runtimes: model is swappable, harness contract is not
  14. Seven components arranged in a closed spec→run→history→decision cycle

    Appendix · Meta-Harness Adoption
    - SPEC: Spec Six-Pack, Pattern Registry
    - RUN: Manifest v2 · metrics, Context / Tooling kind, Dogfood lane
    - HISTORY: Harness Report Ledger, INDEX.jsonl + result.json
    - DECISION: audit-trend Pareto Lite

    one closed cycle: spec → run → history → decision → back to spec
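
The history-to-decision half of this cycle (runs land in INDEX.jsonl, then a tool reads the ledger for trends) can be sketched roughly as follows. This is an illustrative Python sketch, not the speakers' tooling: the record fields (`lane`, `passed`, `duration_s`), the function names, and the window and tolerance defaults are all assumptions, since the slides do not show the ledger schema.

```python
import json
from pathlib import Path


def append_run(ledger: Path, lane: str, passed: bool, duration_s: float) -> None:
    """Append one lane run to the ledger as a single JSON line."""
    # Hypothetical record shape; the real INDEX.jsonl schema is not shown.
    record = {"lane": lane, "passed": passed, "duration_s": duration_s}
    with ledger.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def load_runs(ledger: Path) -> list[dict]:
    """Read every non-empty line of the ledger back into dictionaries."""
    with ledger.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def pass_rate(runs: list[dict]) -> float:
    """Fraction of runs that passed; 0.0 for an empty list."""
    return sum(1 for r in runs if r["passed"]) / len(runs) if runs else 0.0


def regressed(runs: list[dict], lane: str, window: int = 3,
              tolerance: float = 0.2) -> bool:
    """Compare the latest `window` runs of a lane with the window before it."""
    lane_runs = [r for r in runs if r["lane"] == lane]
    if len(lane_runs) < 2 * window:
        return False  # not enough history to call a trend
    previous = lane_runs[-2 * window:-window]
    recent = lane_runs[-window:]
    return pass_rate(previous) - pass_rate(recent) > tolerance
```

An append-only JSON Lines file keeps each run as an independent record, so the ledger survives session restarts and can be read by any later tool without the agent's in-memory state.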