March 19, 2026

Understanding Specification-Driven Code Generation with LLMs: An Empirical Study Design

Abstract: Large Language Models (LLMs) are increasingly integrated into software development workflows, yet their behavior in structured, specification-driven processes remains poorly understood. This paper presents an empirical study design using CURRANTE, a Visual Studio Code extension that enables a human-in-the-loop workflow for LLM-assisted code generation. The tool guides developers through three sequential stages--Specification, Tests, and Function--allowing them to define requirements, generate and refine test suites, and produce functions that satisfy those tests. Participants will solve medium-difficulty problems from the LiveCodeBench dataset, while the tool records fine-grained interaction logs, effectiveness metrics (e.g., pass rate, all-pass completion), efficiency indicators (e.g., time-to-pass), and iteration behaviors. The study aims to analyze how human intervention in specification and test refinement influences the quality and dynamics of LLM-generated code. The results will provide empirical insights into the design of next-generation development environments that align human reasoning with model-driven code generation.

Preprint: https://arxiv.org/abs/2601.03878

Giovanni Rosa

Transcript

Understanding Specification-Driven Code Generation with LLMs: An Empirical Study Design
SoftDev
Speaker: Giovanni Rosa, Postdoctoral Researcher, Universidad Rey Juan Carlos
Authors: Giovanni Rosa, David Moreno-Lumbreras, Gregorio Robles, and Jesus M. Gonzalez-Barahona
AI-Coding Assistants
Towards Spec-Driven Development
Modern AI tools (Copilot & co.): powerful automation, but guided by abstract prompts
Current frontier: representing human intent via formal and structured specifications
Test-Driven Development: code-centric, high human effort
Test-Driven Code Generation: Mathews et al. 2024; Pirya et al. 2024; Fakhoury et al. 2024
A TDD-inspired workflow to guide code generation
The user iteratively refines and validates a formal input specification (tests)
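The iterative loop described on this slide can be sketched roughly as follows. This is a minimal illustration, not CURRANTE's actual implementation: the `Session` class, the `refine_until_pass` helper, and the `generate_function` callback (a stand-in for the LLM call) are all hypothetical names, and tests are assumed to be simple (arguments, expected value) pairs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Assumption: a test is a pair of (positional arguments, expected return value).
TestCase = Tuple[tuple, object]


@dataclass
class Session:
    """Hypothetical container for the three stages: Specification, Tests, Function."""
    specification: str
    tests: List[TestCase] = field(default_factory=list)
    iterations: int = 0


def run_tests(func: Callable, tests: List[TestCase]) -> List[bool]:
    """Run every test; a raised exception counts as a failure."""
    results = []
    for args, expected in tests:
        try:
            results.append(func(*args) == expected)
        except Exception:
            results.append(False)
    return results


def refine_until_pass(
    session: Session,
    generate_function: Callable[[str, List[TestCase], int], Callable],
    max_iterations: int = 3,
) -> Optional[Callable]:
    """Ask the generator for candidates until one passes all tests or the budget runs out."""
    if not session.tests:
        raise ValueError("at least one test is needed to validate a candidate")
    for attempt in range(1, max_iterations + 1):
        session.iterations = attempt
        candidate = generate_function(session.specification, session.tests, attempt)
        if all(run_tests(candidate, session.tests)):
            return candidate
    return None


def fake_generator(spec: str, tests: List[TestCase], attempt: int) -> Callable:
    """Stand-in for an LLM: the first candidate is wrong, the second is correct."""
    if attempt == 1:
        return lambda a, b: a - b
    return lambda a, b: a + b
```

In a real session the developer would edit the specification or the tests between attempts; here the fake generator simply returns a better candidate on the second call, so `refine_until_pass` stops after two iterations.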
The CURRANTE Plugin
Research questions
Study Context
Experimental Protocol
Evaluation Metrics 1/2 RQ1
Evaluation Metrics 2/2 RQ1 RQ2 RQ2
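The abstract names pass rate, all-pass completion, and time-to-pass as example metrics, but the transcript does not give their exact definitions. The sketch below shows one plausible reading, assuming each logged attempt records elapsed time and test counts; the `Attempt` record and all three function definitions are illustrative assumptions, not the study's instrumentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attempt:
    """Hypothetical log entry for one test run of a generated function."""
    seconds_since_start: float
    tests_passed: int
    tests_total: int


def is_all_pass(attempt: Attempt) -> bool:
    """True when the attempt ran at least one test and every test passed."""
    return attempt.tests_total > 0 and attempt.tests_passed == attempt.tests_total


def pass_rate(attempt: Attempt) -> float:
    """Assumed definition: fraction of tests passed in a single attempt."""
    if attempt.tests_total == 0:
        return 0.0
    return attempt.tests_passed / attempt.tests_total


def all_pass_completion(attempts: List[Attempt]) -> bool:
    """Assumed definition: did any attempt in the session pass every test?"""
    return any(is_all_pass(a) for a in attempts)


def time_to_pass(attempts: List[Attempt]) -> Optional[float]:
    """Assumed definition: elapsed seconds at the first all-pass attempt, else None."""
    for attempt in sorted(attempts, key=lambda a: a.seconds_since_start):
        if is_all_pass(attempt):
            return attempt.seconds_since_start
    return None


# Example session log with made-up values, for illustration only.
log = [Attempt(40.0, 2, 5), Attempt(95.5, 4, 5), Attempt(130.0, 5, 5)]
```

With this made-up log, the session counts as an all-pass completion and its time-to-pass is the timestamp of the third attempt; a session that never passes every test yields `None`.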
Challenges & Mitigations
Expected Outcome
Summary
Giovanni Rosa, Postdoctoral Researcher @ URJC. More at giovannirosa.com
Funded by the Spanish AEI, Ref. 2024/00416/002
Acknowledgments
This work is part of the ADVISE project (ADvanced Vision on Intelligent Software Engineering). ADVISE aims to integrate AI agents into software development to ensure code meets requirements and aligns with developer intent. Funded by the Spanish AEI, reference 2024/00416/002. More at https://advise.codeberg.page/