Code smarter, not harder | MD DevDays 2026

Code smarter, not harder
How AI coding tools boost productivity, and where they don't
Daniel Sogl · @sogldaniel · Consultant @ Thinktecture

About me
Daniel Sogl
Consultant @ Thinktecture AG
MVP: Developer & Web Technologies
Focus: Developer Productivity & Generative AI
Socials: linktr.ee/daniel_sogl

A Quick Show of Hands
Who in this room…
1. …used an AI coding assistant this morning?
2. …let an AI agent open a PR for you this week?
3. …shipped AI-generated code you didn't fully understand?

5 Acts in 40 Minutes
Act 1: Where we actually stand (May 2026). Adoption is solved. Impact is not.
Act 2: Smart along the SDLC. Where AI really helps, and where it quietly hurts.
Act 3: From Vibes to Specs. The biggest leverage upgrade of 2026.
Act 4: From Local to Cloud. The PR is the new interface.
Act 5: Roles are changing, all of them. Engineer · QA · PM · Designer · SRE

Act 1: Where we actually stand · May 2026

Adoption Is Solved
We're past the inflection point
84% of developers use or plan to use AI tools (Stack Overflow Developer Survey 2025 · N=49,000)
90% use AI at work (JetBrains AI Pulse Apr 2026 · N=11,000)
73% use AI daily, up from 41% in 2025 (Pragmatic Engineer Feb 2026 · N=15,000)
22% already use coding agents (JetBrains AI Pulse Apr 2026)
Four surveys. ~100,000 developers. The adoption question is closed.
The question now: did it make us smarter?

THE TRUST GAP
84% use or plan to use AI tools
29% actually trust them, down 11 points YoY
We use it. We don't trust it. We use it anyway.
66% spend more time fixing "almost-right" AI code, and it's their #1 frustration.
SOURCE: STACK OVERFLOW DEVELOPER SURVEY 2025 · 49,000 RESPONDENTS

THE PRODUCTIVITY PARADOX · METR RCT
JULY 2025
−19%: experienced OSS devs were slower with AI. They felt 20% faster.
16 devs · 246 tasks · mature repos
FEBRUARY 2026 UPDATE
−18%: same devs, retested. New cohort: −4% (CI: −15 to +9%).
57 devs · 143 repos · 800+ tasks
METR, in their own words: "An unreliable signal of current productivity effects."
30–50% of developers refused tasks they'd have to do without AI. Selection bias killed the RCT design.
SOURCE: METR.ORG · 2025-07-10 & 2026-02-24 · ARXIV:2507.09089

DORA's One-Sentence Diagnosis
"AI's primary role in software development is that of an amplifier. It magnifies the strengths of high-performing organisations and the dysfunctions of struggling ones."
Strong teams get stronger. Struggling teams get worse, faster.
+39% first-year ROI · ~8-month payback
5 → 6% change-failure rate ("instability tax")
↑ Throughput ↓ Stability: a trade-off, not optional
SOURCES: DORA 2025 STATE OF AI · DORA 2026 ROI OF AI-ASSISTED SOFTWARE DEVELOPMENT

Act 2: Smart along the SDLC

Eight Phases. Eight Verdicts.
1. Discovery: Requirements · User Stories
2. Design: Architecture · ADRs
3. Implementation: Greenfield · Legacy
4. Test: Coverage · Mutation
5. Review: AI as reviewer
6. Docs: Drafting · Maintaining
7. Maintenance: Refactor · Legacy
8. Ops: Incident · Debug
One hero number. One caveat. One smart move.

PHASE 1: DISCOVERY · REQUIREMENTS · USER STORIES
HERO NUMBER
31% domain-hallucination rate in real-world LLM use, up to 60% in complex domains
SMART MOVE
Persona prompts + domain glossary file. Treat the LLM like a new hire.
Validated finding: LLMs can generate and assess user stories at scale, but only with explicit domain context. Zero-shot output is generic theatre.
Sources: arXiv:2504.00513 (UStAI) · arXiv:2507.15157 · SQ Magazine LLM Hallucination Stats 2025
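The "persona prompt + domain glossary" move can be sketched in a few lines. This is a minimal illustration, not something shown in the talk: the function name build_story_prompt, the insurance glossary, and the prompt wording are all hypothetical, and the resulting string would still need to be sent to whichever model you use.

```python
def build_story_prompt(persona: str, glossary: dict[str, str], feature: str) -> str:
    """Assemble a user-story prompt that carries a persona and domain terms."""
    # Sort terms so the prompt is stable across runs and easy to diff.
    terms = "\n".join(f"- {term}: {meaning}" for term, meaning in sorted(glossary.items()))
    return (
        f"You are {persona}.\n"
        "Use the glossary terms exactly as defined. "
        "If a term you need is missing, ask instead of guessing.\n\n"
        f"Domain glossary:\n{terms}\n\n"
        f"Write a user story with acceptance criteria for: {feature}\n"
    )


# Hypothetical insurance-domain glossary; in practice this would live in a file.
GLOSSARY = {
    "policyholder": "the person or company that owns the insurance contract",
    "claim": "a formal request for payment under a policy",
}

prompt = build_story_prompt(
    persona="a business analyst who has worked in insurance for ten years",
    glossary=GLOSSARY,
    feature="letting a policyholder track the status of a claim",
)
print(prompt)
```

Keeping the glossary in version control next to the code means every session starts from the same definitions, which is the "new hire onboarding" idea from the slide.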

PHASE 2: DESIGN · ARCHITECTURE · ADRS
HERO NUMBER
980 ADRs across 109 GitHub repos analyzed by LLMs
Violations detected, but cross-decision reasoning is weak
SMART MOVE
Feed your own ADR history into the context window. Path-dependency is everything in architecture.
LLM as sounding board: excellent for pattern matching, options analysis, trade-off enumeration. Weak at "why did we already decide against this".
Sources: arXiv:2602.07609 · arXiv:2604.03826 (EASE 2026) · arXiv:2504.08207 (DRAFT)
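One way to feed ADR history into the context window is to concatenate the decision records in order before asking an architecture question. The sketch below assumes ADRs are Markdown files whose names sort chronologically (for example 0001-..., 0002-...); the function name and the character budget are hypothetical choices, not part of any tool named in the talk.

```python
from pathlib import Path


def build_adr_context(adr_dir: Path, max_chars: int = 20_000) -> str:
    """Concatenate ADR files in filename order until the character budget is used up."""
    parts: list[str] = []
    used = 0
    for path in sorted(adr_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        # Label each record with its file name so the model can cite it.
        block = f"[{path.name}]\n{text}\n"
        if used + len(block) > max_chars:
            break  # Stop before exceeding the budget; later ADRs are left out.
        parts.append(block)
        used += len(block)
    return "\n".join(parts)
```

A call such as build_adr_context(Path("docs/adr")) (the path is an assumption) returns a single string that can be placed ahead of the actual question, so earlier decisions are visible when new options are weighed.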

PHASE 3: IMPLEMENTATION · THE PRODUCTIVITY GAP
GREENFIELD · SIMPLE TASKS
35–40% productivity gain · new projects · clean slate · isolated tasks
COMPLEX LEGACY CODE
≤10% gain, or negative · existing systems · the 90% of real work
The AI productivity gain is up to 4× larger on greenfield than on code you maintain. Most of us live on the right side of this slide.
Smart move: spec-driven + small batches. Big legacy refactors with AI are an anti-pattern. We come back to this in Act 3.
Sources: Stanford SEP (in DORA 2026) · METR 2025 RCT

PHASE 4: TEST · AI MATCHES CODE, NOT REQUIREMENTS
HERO NUMBERS
30 → 90% line coverage in 5 min → zero bugs found
Tests match the code that already exists
FIELD NOTE
"We hit 99% coverage with AI", and zero bugs found. The AI tested what the code does, not what it should do.
Smart move: BDD-first. Tests assert the requirement, not the implementation. AI proves business intent. You review the technical "how".
Sources: eferro Nov 2025 · Nimble Approach Dec 2025 · field experience
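The difference between a test that mirrors the code and a test that asserts the requirement can be shown with a tiny example. Everything here is hypothetical (the shipping rule, the function names, the numbers); it is a sketch of the failure mode, not code from a real project.

```python
def shipping_fee(order_total: float) -> float:
    """Hypothetical requirement: orders of 50.00 or more ship free."""
    # Bug: the boundary is excluded, so an order of exactly 50.00 still pays the fee.
    return 0.0 if order_total > 50.0 else 4.99


def check_mirrors_implementation() -> bool:
    # Derived from reading the code: it asserts what the code does today.
    # Both branches run, so line coverage is complete, yet the bug survives.
    return shipping_fee(50.0) == 4.99 and shipping_fee(50.01) == 0.0


def check_asserts_requirement() -> bool:
    # Derived from the requirement text: "50.00 or more ships free".
    return shipping_fee(50.0) == 0.0


print(check_mirrors_implementation())  # True: passes and hides the bug
print(check_asserts_requirement())     # False: fails and exposes the bug
```

The first check is what a tool tends to produce when it only sees the implementation; the second is only possible when the requirement is written down somewhere the tool can read it.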

PHASE 5: CODE REVIEW · AI AS REVIEWER
DETECTION
49% CodeRabbit precision · 46% runtime-bug detection vs. <20% for classic static analyzers
TURNAROUND
9.6 → 2.4 days · Duolingo PR median · −67% · +70% PR volume
FIELD NOTE
AI as pre-reviewer: catches the obvious, frees humans for taste & architecture.
The catch: AI-generated code produces 1.7× more issues than human code (CodeRabbit, N=470 OSS PRs).
→ More AI authoring ⇒ more AI reviewing.
Smart move: AI catches first pass, you decide what ships.
Sources: CodeRabbit Martian Benchmark 2025 · CodeRabbit "State of AI vs Human Code" Dec 2025 · GitHub Docs

PHASE 6: DOCUMENTATION
ADOPTION SURGE
19% → 35% of teams using AI as primary doc tool, nearly doubled in one year
REALITY CHECK
Synthetic benchmarks: 84–89% correct. Real-world classes: 25–34% correct.
Smart move: docs as versioned spec artefacts, not afterthought. AI drafts, humans validate against intent: same loop as code.
Sources: State of Docs 2026 · arXiv:2510.26130 (Real-world Class-Level Code Doc)

PHASE 7: MAINTENANCE · THE COMPOUND INTEREST PROBLEM
×8 code-clone blocks YoY (≥ 5-line duplicates)
25 → <10% refactored-code share of all changes (cut by 60% since 2021)
5.5 → 7.9% code churn within 2 weeks (code revised right after merge)
FIELD NOTE · WPF → WEB MIGRATIONS
Run the agent blind on a legacy migration and it translates the old debt into the new architecture: same anti-patterns, new stack.
Smart move: name the weaknesses first. Anti-patterns in instructions · domain rules in skills · self-healing hooks to catch regressions.
Proven across multiple client engagements: WPF desktop → modern web apps.
Source: GitClear AI Copilot Code Quality Report 2025 · 211M changed lines · field experience
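A hook that catches regressions can be as simple as scanning newly added lines for the anti-patterns the team has named. The sketch below is an assumption about how such a check might look: the two patterns and the function name are hypothetical, and wiring it into a specific agent's hook mechanism is left out because that differs per tool.

```python
import re

# Hypothetical anti-patterns a team might name before a migration to a typed web stack.
ANTI_PATTERNS = {
    "untyped value (any)": re.compile(r":\s*any\b"),
    "direct DOM access": re.compile(r"\bdocument\.getElementById\("),
}


def find_regressions(added_lines: list[str]) -> list[tuple[int, str]]:
    """Return (line number, anti-pattern name) for every match in the added lines."""
    findings: list[tuple[int, str]] = []
    for number, line in enumerate(added_lines, start=1):
        for name, pattern in ANTI_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


added = [
    "const total: number = 1;",
    "let state: any = load();",
    "document.getElementById('grid');",
]
print(find_regressions(added))
# [(2, 'untyped value (any)'), (3, 'direct DOM access')]
```

Returning the findings as data, rather than only failing, lets the result be handed back to the agent as feedback so it can correct its own output.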

PHASE 8: OPS · INCIDENT RESPONSE · DEBUG
HERO NUMBERS
−40 to −70% MTTR in AI-augmented pilot teams
13,000 engineering hours saved · Uber Genie Co-Pilot since Sep 2023
CAVEAT
+30% manual toil despite AI investment. Tool-sprawl is the new bottleneck.
Smart move: fix your alerting before adding AI on top. AIOps amplifies noise just as well as signal.
Sources: Rootly 2025 State of DevOps · Uber Engineering · Digital Applied 2026 · incident.io

Where AI Actually Helps: A Cheat Sheet
HIGH LEVERAGE: Greenfield implementation · Code review (pre-pass) · Incident triage & correlation · User-story / doc drafting · Test scaffolding (then mutate)
LOW / NEGATIVE LEVERAGE: Complex legacy refactors · Cross-decision architecture · Domain-heavy requirements (no glossary) · "Vibe maintenance" · AIOps on top of broken alerting
Smart isn't "more AI". Smart is AI in the right place, and knowing when to keep it out.

Act 3: From Vibes to Specs

The Four Pillars of AI Coding
Red Hat's framework, and where most of us are stuck
Vibes: Intuitive, conversational. Fast, until it isn't.
Specs: Explicit intent. Repeatable. Team-shareable.
Skills: Reusable agent capabilities. Composable.
Agents: Plan, execute, iterate. Autonomously.
Most teams ship from Vibes. The wins are in Specs.
Source: Red Hat Developer · "Vibes, specs, skills, and agents" · March 2026

SPEC-DRIVEN DEVELOPMENT: THE BIGGEST LEVERAGE UPGRADE OF 2026
Stop coding. Start specifying.
Vibe coding hits a wall around a few hundred lines. Agents guess at unstated requirements.
The fix: define behaviour, constraints, acceptance criteria, then let the agent implement against that contract.
THE INVERSION
YESTERDAY: prompt → code → fix → re-prompt
TODAY: spec → generate → verify → merge
Once the spec is solid, AI agents become interchangeable. The speedup comes from alignment, not faster typing.
Source: Microsoft Developer Blog · GitHub Spec Kit · Sept 2025

THE WORKFLOW: FROM IDEA TO PR IN 4 PHASES
01 Specify: Business context. Success criteria. The what. → spec.md
02 Plan: Architecture choices. Tech stack. The how. → plan.md
03 Tasks: Decomposition. Testable units. The steps. → tasks.md
04 Implement: Agent under contract. You review. The PR. → Pull Request
THE CONSTITUTION: Immutable principles across every session, your persistent contract with the agents.
CHECKPOINTS: Cross-artefact consistency runs before implement, not after.
Source: github/spec-kit · ~100k★ · Kiro Specs · Tessl
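To make the checkpoint idea concrete, here is a minimal sketch of a cross-artefact consistency check between spec.md and tasks.md. It assumes a convention that is not stated in the talk: requirements carry IDs such as REQ-001, and tasks reference those IDs. It is not how Spec Kit or any other named tool implements its checks.

```python
import re

# Assumed convention: requirements carry IDs such as REQ-001 in both files.
REQ_ID = re.compile(r"\bREQ-\d+\b")


def check_consistency(spec_text: str, tasks_text: str) -> dict[str, list[str]]:
    """Compare requirement IDs in the spec with those referenced by tasks."""
    spec_ids = set(REQ_ID.findall(spec_text))
    task_ids = set(REQ_ID.findall(tasks_text))
    return {
        "uncovered": sorted(spec_ids - task_ids),  # required, but no task implements it
        "unknown": sorted(task_ids - spec_ids),    # task points at a requirement nobody wrote
    }


spec = "REQ-001 Users can reset their password.\nREQ-002 Reset links expire after one hour.\n"
tasks = "T1 Add reset endpoint (REQ-001)\nT2 Add audit log (REQ-003)\n"
report = check_consistency(spec, tasks)
print(report)
# {'uncovered': ['REQ-002'], 'unknown': ['REQ-003']}
```

Running a check like this before the implement phase surfaces the gap while it is still a one-line edit to a document rather than a review comment on generated code.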

SDD IN PRACTICE: THREE LEVELS OF RIGOR
LEVEL 1 · Spec-first
Persistent context for every session. No automation. Where most teams start.
AGENTS.md · CLAUDE.md · .cursorrules
LEVEL 2 · Spec-anchored (SWEET SPOT)
Spec evolves with code. Slash commands, checkpoints, cross-artefact consistency.
GitHub Spec Kit (~100k★ · 30+ agents supported) · Kiro (AWS · agentic IDE · EARS notation)
LEVEL 3 · Spec-as-source
Humans only edit specs, never generated code. Generated files marked DO NOT EDIT.
Tessl (private beta · spec is the source)
Most teams in May 2026 sit between Level 1 and Level 2.
If you take one thing from this talk: write an AGENTS.md tonight.
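At Level 3 the rule "never edit generated code" only holds if something enforces it. The sketch below shows one possible guard, under the assumption that generated files carry the DO NOT EDIT marker within their first few lines; the function names and the five-line window are hypothetical, and collecting the changed files (for example in a pre-commit step) is left to the caller.

```python
MARKER = "DO NOT EDIT"


def is_generated(source: str, header_lines: int = 5) -> bool:
    """True if the marker appears within the first few lines of a file."""
    return any(MARKER in line for line in source.splitlines()[:header_lines])


def blocked_edits(changed_files: dict[str, str]) -> list[str]:
    """Paths of changed files that carry the generated-code marker."""
    return sorted(path for path, source in changed_files.items() if is_generated(source))


# Hypothetical change set: path -> current file content.
changed = {
    "src/orders.py": "# GENERATED FROM specs/orders.md. DO NOT EDIT.\ndef total(): ...\n",
    "specs/orders.md": "Orders of 50.00 or more ship free.\n",
}
print(blocked_edits(changed))
# ['src/orders.py']
```

A non-empty result would mean a human touched generated output; the fix in a spec-as-source workflow is to change the spec and regenerate.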

Two Rules Worth Stealing
Simon Willison · creator of Datasette · co-creator of Django:
"I won't commit code I couldn't explain to someone else."
→ Forces understanding. Kills hallucinated dependencies. Catches silent bugs.
Addy Osmani · Google:
Beware "house of cards code".
→ Fragile AI output that collapses under scrutiny. Specs in workflows prevent it.

Act 4: From Local to Cloud

The 4-Year Shift
Where does AI code actually come from?
2022: Autocomplete in IDE
2023: Chat side panel
2024: Agents in IDE
2025/26: Async Cloud Agents in your PR queue
The interface to AI is no longer the cursor. It's the pull request.

The Cloud Agent Landscape · May 2026
Devin · Cognition: Production at Goldman Sachs, Citi, Nubank, Dell. Valuation $10.2B → $25B in talks (Apr 2026).
OpenAI Codex Cloud: 4M+ weekly developers (Apr 2026 · OpenAI announcement). 10× growth since Aug 2025.
GitHub Copilot Coding Agent: Assign an issue → get a PR. ~1.2M PRs/month. CODEOWNERS, branch protection apply.
Cursor 3 ("Glass"): April 2026. Parallel agents in worktrees + cloud sandboxes. 30% of Cursor's own merged PRs from background agents.
Also in the field: Claude Code (via GitHub Action) · Google Jules · Sourcegraph Amp · Tembo.
Sources: Vendor announcements · OpenAI 21.4.2026 · Bloomberg 23.4.2026 · InfoQ 2.4.2026

DEVIN · COGNITION: FROM DEMO TO PRODUCTION
VALUATION · 13 MONTHS
$4B (Mar 2025) → $10.2B (Sep 2025) → $25B (Apr 2026, in talks)
NUBANK: PRODUCTION CASE
6M-line monolith · ~100K data classes
Parallel Devin sessions migrated the ETL stack. 12× efficiency, 20× cost savings. 18-month plan → shipped in weeks. 40 min → 10 min per subtask.
THE HONEST CAVEAT
Async is powerful. It's not autopilot.
narrow + well-specified → ships reliably
ambiguous + cross-cutting → senior engineer reviewing every step
Cloud agents reward Act 3. Bad specs = expensive garbage.
Sources: cognition.ai/customers/nubank · Bloomberg Apr 2026 · SiliconANGLE

THE PR IS THE NEW INTERFACE
YESTERDAY: Author → TODAY: Editor-in-Chief
OLD LOOP: open IDE · write · test · commit · push · PR · wait for review
NEW LOOP: write spec · assign agent · do other work · agent opens PR · you review · iterate · merge

Act 5: Roles are changing, all of them

The Companies Already Moved
Product Engineers, not "developers"
Linear: No traditional PMs. ~25 engineers, 1 head of product.
PostHog: Same playbook. Published the playbook.
Vercel: "Code-last" philosophy. Outcomes > commits.
Stripe: Early pioneer. High-ownership engineering.
Shopify: Product engineers shipping product, not features.
incident.io: JD: "outcomes & impact > exact implementation"
"In an AI-first era, product engineering is more important than ever. Dare I say — it's basically the only thing left."
Lee Robinson · Vercel / Cursor

THE JUNIOR CRISIS: STANFORD "CANARIES IN THE COAL MINE" · NOV 2025
AGE 22–25: −20% software-developer employment since late-2022 peak
AGE 35–49 · SAME ROLES: +6–9% in the same AI-exposed roles
AI rewards existing expertise. It punishes the "junior implementer". Routine implementation evaporates first.
The question this opens: if the junior pipeline collapses, who's the senior in 5 years?
Source: Brynjolfsson, Chandar, Chen · Stanford Digital Economy Lab · ADP payroll data · 3.5–5M workers

The Team Shifts Too, Not Just Engineers
QA → Quality Owner: 58% of enterprises upskill QA in AI. QA-engineer roles +17% YoY vs. devs +9%. Test-author → quality strategist.
PM → Builder: Linear has no PMs. PMs ship a prototype to stakeholder review in <10 minutes using Claude Code & v0.
Designer → Frontend Author: v0, Galileo V3, paper.design turn Figma into production-ready frontend code. The designer's deliverable becomes a PR.
SRE → Platform Multiplier: DORA: 90% of orgs have a platform · 76% a dedicated team. Platform quality decides whether AI helps or hurts.
Every role on the team is moving up the abstraction stack. The deliverable changes, but the seat in the room stays human.
Sources: World Quality Report 2025 · Lenny on Linear · Productside · DORA 2025

WHAT STAYS HUMAN
AI replaces tasks. Not people.
The realistic risk isn't being replaced by AI. It's being out-competed by someone on your team who uses it smarter.

So what do you do on Monday?

Three Concrete Things, Starting Monday
1. Write your first real spec. A CLAUDE.md, AGENTS.md, or .cursorrules for your most active repo. Treat it like onboarding for a new hire.
2. Pick one SDLC phase to optimise. Not "use more AI everywhere". Pick test, or review, or incident triage, and measure the change for 2 weeks.
3. Run a cloud agent on one real backlog item. Pick a Copilot / Codex / Devin / Jules task. Let it open the PR. Review like a senior would. Notice what you'd actually ship.
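"Measure the change for 2 weeks" can start with one number. The sketch below compares the median of a metric before and after the change; the metric (PR review turnaround in hours), the sample values, and the function name are all made up for illustration. The median is used because a single stuck PR would distort a mean.

```python
from statistics import median


def relative_change(before: list[float], after: list[float]) -> float:
    """Relative change of the median; -0.2 means the median dropped by 20%."""
    base = median(before)
    return (median(after) - base) / base


# Hypothetical review turnaround in hours, two weeks before vs. two weeks after.
before = [20, 30, 40, 50]
after = [14, 28, 28, 42]
print(f"{relative_change(before, after):+.0%}")  # -20%
```

With samples this small the result is an indication, not proof; the point is to have a baseline written down before the experiment starts.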

Smart isn't "more AI". Smart is AI in the right place, at the right time, and knowing when not to use it at all.

Thank you! Questions?
linktr.ee/daniel_sogl · thinktecture.com
Slides & socials
