
Advancing with Java


Rodrigo Graciano

March 06, 2026

Transcript

  1. Beyond Coding Taming AI Agents, Static Analysis, and Automated Testing

    to Reclaim Developer Productivity RODRIGO GRACIANO & CHANDRA GUNTUR Advancing with Java
  2. The Pipeline Turns Red Not a 'one test failed' red.

    A full Christmas-tree red. PR CI Deploy
  3. You Discover You Hit the Trifecta The PR can fail

    for three different reasons. Security: Dependency CVE • High risk • Blocked merge. Quality Gate: Static analysis findings • New critical findings • Suppressions. Coverage: 0.7% below threshold • Gate says “no” • Write tests now
  4. What Happens Next? This is where velocity goes to die.

    1. Open 5 browser tabs (or ask your favorite agent) 2. Copy the CVE into chat 3. Ask in Slack: “false positive?” 4. Context-switch … then code quality fixes and better tests Late feedback is expensive feedback.
  5. An Alternate Path What if: • … this could be

    handled differently with early detection • … we could then “shift-left” the handling of such problems
  6. Same PR. Different Mindset. Move signals earlier so fixes happen

    while you still have context. Late feedback (pain) CI discovers issues after merge Dev loses context Fixes are rushed and risky Compliance feels like a tax Early feedback (flow) IDE detects issues pre-emptively Fix in the same mental model Automation increases resolution speed Compliance becomes default path
  7. Roadmap We’ll revisit that Friday merge at each chapter. Chapter

    1 • AI agents • AGENTS.md guardrails • Definition of Done Chapter 2 • Static analysis • Security & SCA • Dependency automation Chapter 3 • Tests & coverage • Mutation testing • Quality > numbers
  8. Meet the Speakers Rodrigo Graciano JUG Leader @NYJavaSIG & @GardenStateJUG

    • Blog: graciano.dev • Code: github.com/rodrigolgraciano • X: @RodrigoGraciano Chandra Guntur Java Champion · JUG Leader @GardenStateJUG & @NYJavaSIG • Blog: cguntur.me • Code: github.com/c-guntur • X: @CGuntur
  9. A Typical Developer's Day What does a developer actually spend

    their time on? Writing Code About 8h a day, right?! CI/CD Builds, deployments, pipeline maintenance Meetings 2-4h a day in standups, reviews, syncs? User Requests Bug reports, support escalations, ad hoc asks Juggling Deadlines Vulnerabilities, upgrades, audits, tech debt
  10. The Developer’s Time Where Time Actually Goes Coding ~20%

    or less — actual feature development. Meetings & Requests ~30% — communication and coordination overhead. What Drains the Rest • Handling security vulnerabilities and dependency upgrades • Compliance audits and policy enforcement • Quality metrics: coverage gaps, tech debt, security findings • Manual refactoring and modernization work
  11. The Good News The tooling ecosystem is evolving fast —

    and AI is now a first-class participant. The “em-dash” was intentionally added. Tools Evolving to Help Both open-source and commercial solutions targeting developer productivity AI Agents Can Help AI coding agents can automate compliance, testing, and modernization tasks One Unified Goal Productivity + Compliance + Quality, without the manual grind CHAPTER 1
  12. What Belongs in AGENTS.md Keep it short, concrete, and testable.

    Agents comply best when requirements are measurable. Setup Commands Exact build, test, and lint commands the agent can run Definition of Done Tests + lint + format + security checks must all pass Quality Gates "Must satisfy Sonar rules / 0 new Criticals / Qodana pass" PR Expectations Tests updated, changelog noted, suppressions justified Practical Strategy Write a solid root AGENTS.md, then copy the top 10–20% most critical rules into each tool's native file: .github/copilot-instructions.md, CLAUDE.md, and .cursorrules. Every agent gets the same non-negotiables. In monorepos, place a stricter AGENTS.md inside a subdirectory (e.g., /backend) to override the root for that subtree — the nearest file wins.
  13. Scenario: The Agent’s PR The agent fixed a bug… and

    accidentally broke standards. What Devs see “Looks fine to me.” Small change. Clean diff. Merged fast… What CI sees Style violations New static analysis findings Coverage regression
  14. Why Agents Go Off-Rails Not malicious. Just missing your constraints.

    No shared rules: every tool behaves differently. No 'Definition of Done': not measurable. Vague instructions: hard to verify. Late feedback: issues surface in CI, forcing rework + context switch.
  15. A Minimal AGENTS.md Template Short • concrete • testable. Start

    here, then iterate.

    # AGENTS.md (root)
    ## Setup
    - Use Java 21 (or org standard)
    - Build: ./mvnw -q -DskipTests=false test
    - Lint/format: ./mvnw -q spotless:check
    ## Definition of Done (non-negotiable)
    - All tests pass (unit + integration where applicable)
    - No new Blocker/Critical issues in CI
    - Coverage does not decrease for touched code
    - No new high-severity vulnerabilities
    ## PR expectations
    - Update/add tests for behavior changes
    - Explain any suppression (false positive + narrow scope)

    Tip: keep this under ~60 lines. Agents follow short docs best.
  16. Make It Measurable Agents comply best when they can verify

    success. Vague “Improve code quality.” “Fix security issues.” “Add more tests.” Testable “0 new Critical Findings in CI.” “No coverage decrease.” “Add 3 tests: happy / failure / edge.” If you can’t test it, the agent can’t reliably do it.
  17. One Source of Truth + Thin Adapters Root AGENTS.md →

    copy the top 10–20% into each tool’s native file. AGENTS.md (root) Adapters (tool-specific) .github/copilot-instructions.md CLAUDE.md .cursorrules …others as needed Intent: Every agent sees the same non-negotiables.
  18. An Agent Workflow That Doesn’t Scare You Treat the agent

    like a junior dev: small steps + verification. 1. Agent proposes plan (no code yet) 2. Agent makes a small change set 3. Agent runs build + tests + linters 4. Agent fixes findings (no config edits) 5. Human reviews + merges Guardrails reduce rework and make agents safe in enterprise.
  19. Chapter 1: Takeaway Security and quality scale when the fix

    path is fast. AI agents become reliable development partners when teams provide: • clear rules • measurable quality gates • a shared Definition of Done
  20. Static Analysis & Security Checkstyle Style and convention enforcement, highly

    configurable. Dependabot Automated dependency upgrades via GitHub PRs. Snyk OSS vulnerability scanning with actionable fix guidance. Qodana JetBrains — deep Java/Kotlin analysis, CI-ready quality gates. Common findings: SQL injection, path traversal, hardcoded credentials, IaC misconfigs, sensitive data exposure via entities, and OSS CVEs. CHAPTER 2
  21. Scenario: “We Didn’t Touch That Dependency” Yet the build is

    blocked by a vulnerability. What dev says “This is a tiny feature.” “Why is security blocking us?” “Can we ignore it just this once?” What the org needs Known risk reduced Audit trail of remediation Consistent enforcement Tools help when they provide a clear, fast fix path.
  22. Security Signals: What’s What? Different tools, different problems. Don’t treat

    them as one bucket. SCA Software Composition Analysis Dependencies / CVEs SBOM-friendly Fast wins. SAST Static Application Security Testing Code patterns Injection, traversal False positives exist. Secrets Keys/tokens in code Pre-commit/PR Rotate + revoke. IaC / Config Terraform/K8s policies Misconfig detection Policy-as-code. Runtime (bonus) Firewalls, Runtime Application Self-Protection (RASP), monitoring. Not a substitute for SAST/SCA
  23. Shift-Left Placement Put the cheapest checks closest to the developer.

    IDE: fast hints, quick feedback, lint/basic scans. PR: actionable review, security and quality comments with fix guidance. CI: enforced gates, no negotiation. Release: audit trail, release report, evidence for auditors.
  24. Dependency Upgrades Without Pain Automate the tedious parts. Humans review

    the risky parts. Automation Scheduled upgrade PRs Grouped updates (by ecosystem) Auto-merge for low-risk patches Changelogs + CVE context Human judgment Major upgrades Behavior changes Breaking transitive updates Rollout strategy Rule of thumb: automate PR creation, not risk decisions.
  25. Triage Without Chaos A small, consistent policy beats ad-hoc debates.

    Good defaults Block new high-severity issues Fix what you touch Timebox backlog reduction Prefer upgrades over suppressions Suppressions allowed only if… Confirmed false positive Narrowest possible scope Comment explains why Ticket/link for follow-up
  26. Using an Agent to Remediate Safely Give the agent guardrails:

    what to fix, what NOT to touch.

    Prompt pattern:
    You are fixing a security finding in this repo.
    Constraints (non-negotiable):
    - Do NOT change qodana.yaml / sonar config / CI workflows.
    - Fix root cause; do NOT suppress unless confirmed false positive.
    - Run: ./mvnw test (and include output summary).
    - Add/adjust tests for the behavior change.
    Task:
    - Remediate the finding in FooService (path shown in report).
    - Explain the fix in 3 bullet points.
  27. Example Gate: Qodana (in AGENTS.md) Short, explicit, and merge-blocking.

    ## Must-pass quality gates
    1. Qodana: PASS with 0 new issues.
    2. Tests: green before PR.
    ## Non-negotiable rules
    - Treat findings as merge blockers.
    - Respect qodana.yaml — do not modify.
    - Fix root cause; do NOT suppress.
    - Only add @SuppressWarnings if:
      - confirmed false positive AND
      - narrowest possible scope AND
      - comment explains why.
    ## Run locally
    qodana scan --fail-threshold 0
    # or: ./gradlew qodanaScan
  28. Example Gate: SonarQube (in AGENTS.md) Make the definition of done

    unambiguous.

    ## Definition of Done
    - Pass SonarQube Quality Gate in CI.
    - Zero new Blocker/Critical/Major issues.
    - Fix Sonar issues in any code you touch.
    - No //NOSONAR without justification.
    ## Sonar-friendly coding rules
    - Small methods; low cognitive complexity.
    - Always use try-with-resources.
    - Never swallow exceptions.
    - No eager log string construction.
    - Keep tests updated with behavior.
    ## PR notes
    "SonarQube: PASS (no new issues)"

    “Policy as code” beats tribal knowledge.
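The Sonar-friendly coding rules above lend themselves to a short sketch. A minimal illustration, not from the deck: the ReportReader class and its firstLine method are hypothetical, chosen to show try-with-resources, no swallowed exceptions, and lazy log-message construction in one place.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example illustrating the Sonar-friendly rules above.
public class ReportReader {
    private static final Logger LOG = Logger.getLogger(ReportReader.class.getName());

    // Rule: always use try-with-resources -- the reader is closed even on failure.
    public static String firstLine(String content) {
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line = reader.readLine();
            // Rule: no eager log string construction -- the supplier lambda
            // runs only when FINE logging is actually enabled.
            LOG.log(Level.FINE, () -> "Read first line of " + content.length() + " chars");
            return line != null ? line : "";
        } catch (IOException e) {
            // Rule: never swallow exceptions -- wrap and rethrow with context.
            throw new UncheckedIOException("Failed to read report", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLine("first\nsecond"));
    }
}
```

Each rule here maps directly to a common Sonar finding, which is what makes the gate testable rather than a style debate.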
  29. Chapter 2: Takeaway Security and quality scale when the fix

    path is fast. • Separate signals (SCA vs SAST vs secrets) • Shift-left: IDE + PR feedback, CI enforcement • Automate dependency PRs; humans review risk • Use agents for remediation — with guardrails
  30. Unit Testing, Coverage & Quality JUnit 5 The de facto standard.

    Scaffolded by Copilot with proper Arrange/Act/Assert structure. Diffblue Cover AI-generated Java tests. Speeds up coverage but often produces skeleton tests — always review. PiTest Mutation testing validates test quality, not just coverage. Slower runs, best in CI. High Coverage ≠ High Quality If your tests never fail, do they really test? Aim for meaningful assertions, not line counts. Diffblue tests are built to pass — verify they actually catch regressions. AGENTS.md testing rule: Every public method needs at minimum 3 tests — one happy path, one failure path (assertThrows), and one boundary/edge case. Name tests as method_whenCondition_thenResult. CHAPTER 3
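The AGENTS.md testing rule above can be sketched as follows. A hypothetical example: the unitPrice method and the test names are invented, and plain runtime checks stand in for JUnit 5's assertEquals/assertThrows so the sketch stays self-contained.

```java
// Hypothetical sketch of the AGENTS.md testing rule: three tests per public
// method (happy path, failure path, boundary), named method_whenCondition_thenResult.
public class PriceCalculatorTest {

    // Code under test (invented for illustration).
    static int unitPrice(int total, int quantity) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        return total / quantity;
    }

    // Happy path.
    static void unitPrice_whenQuantityPositive_thenReturnsQuotient() {
        check(unitPrice(100, 4) == 25);
    }

    // Failure path (JUnit 5: assertThrows(IllegalArgumentException.class, ...)).
    static void unitPrice_whenQuantityZero_thenThrows() {
        try {
            unitPrice(100, 0);
            check(false); // should never reach here
        } catch (IllegalArgumentException expected) {
            // expected: the failure path is exercised
        }
    }

    // Boundary/edge case.
    static void unitPrice_whenQuantityOne_thenReturnsTotal() {
        check(unitPrice(42, 1) == 42);
    }

    static void check(boolean condition) {
        if (!condition) throw new AssertionError("test failed");
    }

    public static void main(String[] args) {
        unitPrice_whenQuantityPositive_thenReturnsQuotient();
        unitPrice_whenQuantityZero_thenThrows();
        unitPrice_whenQuantityOne_thenReturnsTotal();
        System.out.println("all tests passed");
    }
}
```

The naming convention doubles as documentation: a failing test name tells the reviewer which condition and expectation broke.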
  31. Scenario: Coverage Gate Fails The PR is correct… but you

    don’t have proof. What happens Coverage dips by a tiny amount CI blocks merge Team scrambles for tests Root cause Tests weren’t part of the flow Low confidence in behavior Quality gate becomes a fight
  32. High Coverage ≠ High Quality If your tests never fail,

    do they really test? Coverage tells you… What lines ran What didn’t crash A rough signal Quality tells you… Behavior is asserted Regressions get caught Edge cases are covered Goal: reduce uncertainty, not chase numbers.
  33. A Practical Test Strategy (Services) Mix fast unit tests with

    a few high-value integration tests. Unit (many) Pure functions • Business rules • Fast feedback Integration (some) DB + HTTP clients • Testcontainers where useful • Real wiring Contract/E2E (few) Critical paths only • Run in CI/nightly • High signal Mutation (many) Unit test quality • Run locally/CI • Fast feedback
  34. AI-Assisted Testing (Human-in-the-Loop) Let the agent scaffold; you provide intent

    and assertions. 1. Agent generates a test skeleton (Arrange/Act/Assert) 2. You add meaningful assertions and edge cases 3. Agent refines and runs tests until green 4. You review for readability + maintainability
  35. Mutation Testing: Does Your Test Suite Fight Back? A mutation

    tool changes your code by introducing mutants. Good tests should fail when the code is altered. What it catches Missing assertions Over-mocked tests Logic that isn’t validated How to use it Run on key modules In CI/nightly (it’s slower) Track killed vs survived mutants If mutants survive, your tests might be lying.
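The killed-vs-survived idea above can be shown with a toy hand-made mutant. This is only an illustration (real tools such as PiTest generate and run mutants automatically at the bytecode level; the freeShipping example is invented): a test that avoids the boundary lets the mutant survive, while a boundary assertion kills it.

```java
// Toy illustration of mutation testing (hypothetical code; real tools like
// PiTest create mutants automatically and rerun your test suite against them).
public class MutationDemo {

    // Original: free shipping at 50 or above.
    static boolean freeShipping(int total) {
        return total >= 50;
    }

    // Mutant: '>=' mutated to '>' (a classic conditional-boundary mutant).
    static boolean freeShippingMutant(int total) {
        return total > 50;
    }

    public static void main(String[] args) {
        // Weak test: input far from the boundary -- both versions agree,
        // so this test would pass against the mutant (mutant survives).
        boolean weakKillsMutant = freeShipping(80) != freeShippingMutant(80);

        // Strong test: checks the boundary itself -- the mutant disagrees,
        // so this test would fail against it (mutant killed).
        boolean strongKillsMutant = freeShipping(50) != freeShippingMutant(50);

        System.out.println("weak test kills mutant:   " + weakKillsMutant);
        System.out.println("strong test kills mutant: " + strongKillsMutant);
    }
}
```

A surviving mutant is exactly the "tests might be lying" signal: the code changed behavior and no assertion noticed.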
  36. Flaky Tests Kill Trust Once Devs stop trusting CI, quality

    gates stop working. Common causes Time / randomness Shared state Order dependencies External services Good practices Deterministic tests Isolated fixtures Use containers for dependencies Quarantine & fix quickly
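The first flakiness cause above (time/randomness) has a standard Java fix: inject a java.time.Clock instead of calling Instant.now() directly, so the test pins "now" to a fixed instant. A minimal sketch; TokenChecker and its isExpired method are hypothetical.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hypothetical example of removing a common flakiness source: the class
// depends on an injected Clock rather than the system clock.
public class TokenChecker {
    private final Clock clock;

    public TokenChecker(Clock clock) {
        this.clock = clock;
    }

    // Deterministic under test because 'now' comes from the injected clock.
    public boolean isExpired(Instant expiresAt) {
        return Instant.now(clock).isAfter(expiresAt);
    }

    public static void main(String[] args) {
        // Production code would pass Clock.systemUTC(); the test pins time.
        Instant fixedNow = Instant.parse("2026-03-06T12:00:00Z");
        TokenChecker checker = new TokenChecker(Clock.fixed(fixedNow, ZoneOffset.UTC));

        System.out.println(checker.isExpired(fixedNow.minusSeconds(1))); // expired
        System.out.println(checker.isExpired(fixedNow.plusSeconds(1)));  // still valid
    }
}
```

The same injection pattern applies to randomness (pass a seeded java.util.Random) and to external services (pass a fake or a container-backed client).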
  37. Chapter 3: Takeaway Make tests part of the flow —

    not a tax at the end. • Coverage gates are fine, but pair them with easy test scaffolding • Prefer high-signal assertions over line-count heroics • Use mutation testing on critical modules • Fix flakiness fast or Devs stop trusting CI
  38. A Monday Adoption Plan (30 / 60 / 90) Start

    small. Prove value. Then scale. 30 days Pick 1 service/module Add AGENTS.md (minimal) Turn on 1–2 gates Automate dependency PRs 60 days Expand to team repos Add test scaffolding flow Triage policy + suppressions Track rework reduction 90 days Bake into CI templates Run mutation tests nightly Modernization recipes Make compliance boring
  39. Toolchain Map (Where Each Fits) The point isn’t tools. It’s

    a workflow. IDE / local Fast linting • quick scans • agent-assisted edits PR checks Actionable comments • dependency PRs • lightweight gates CI gates Hard enforcement • evidence • consistent policy AGENTS.md (contract) Definition of Done Commands to run Quality + security rules Test expectations Suppression policy
  40. Final Thought We’re not optimizing for tools. We’re optimizing for

    flow. Make compliance the default path, not an interrupt. Thank you — Q&A next.