Always using LLMs to automate boring, heavy-labor tasks
Formal Verification ≈ Check specifications logically with a machine
Continuous Integration ≈ Continuously check every new change
storage of item attributes randomly dropped attributes
It only occurred in production
There were no logs or errors at all
Actual: random attributes are dropped silently
System writes the change to the database
No logs or errors occur
Example write: Color = Dark blue, plus Authentication = Verified, should yield Color = Dark blue and Authentication = Verified
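The requirement behind this slide can be written as a small invariant. The sketch below is illustrative only: the function names are hypothetical, and the faulty result is hand-written to match the symptom, not taken from the real service.

```python
from typing import Dict

Attributes = Dict[str, str]


def apply_change_expected(stored: Attributes, change: Attributes) -> Attributes:
    """Reference behaviour: merge the change into the stored attributes."""
    merged = dict(stored)
    merged.update(change)
    return merged


def no_attribute_dropped(before: Attributes, after: Attributes) -> bool:
    """Invariant: every attribute present before the write is still present."""
    return set(before) <= set(after)


before = {"Color": "Dark blue"}
change = {"Authentication": "Verified"}

expected_after = apply_change_expected(before, change)
# Hypothetical faulty outcome matching the symptom: the existing attribute is lost.
faulty_after = {"Authentication": "Verified"}

print(no_attribute_dropped(before, expected_after))
print(no_attribute_dropped(before, faulty_after))
```

The invariant holds for the merged result and fails for the faulty one. A formal spec would state the same property over every reachable state instead of over one example.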
is a production-only bug. There were no errors and no logs when the drop happened. Finding the issue was costly, and each occurrence affected only a few items. The issue was resolved after the incident was identified.
1. Check out the commit before the bug fix
2. Give no hint about the cause
3. Describe the symptoms
4. List the files that may be involved
5. Ask the LLM service to build a formal spec that reveals the bug
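The steps above amount to assembling a prompt that carries symptoms and candidate files but no hint about the cause. A minimal sketch follows, assuming a plain-text prompt. The function name, the prompt wording, and the file path are all hypothetical, and the call to the LLM service itself is left out.

```python
from typing import List


def build_spec_prompt(symptoms: List[str], candidate_files: List[str]) -> str:
    """Assemble a prompt with symptoms and files but no hint about the cause."""
    lines = ["Observed symptoms:"]
    lines += [f"- {s}" for s in symptoms]
    lines.append("Files that may be involved:")
    lines += [f"- {path}" for path in candidate_files]
    lines.append(
        "Write a TLA+ specification of this behaviour and state invariants "
        "whose violation would explain the symptoms."
    )
    return "\n".join(lines)


prompt = build_spec_prompt(
    symptoms=[
        "Random item attributes are dropped silently",
        "Only occurs in production",
        "No logs or errors",
    ],
    candidate_files=["attributes/store.py"],  # hypothetical path
)
print(prompt)
```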
[Diagram labels: Invariants (requirements) hold; Bug found; Implementation code]
• No time to learn new concepts
• No time to apply the knowledge
• Duplicated spec writing
• Extra work before coding
• Delivery time gets longer
• Code may still be buggy
• Tests are still required
[Diagram labels: (ambiguous); Formal spec (TLA+); ModelFuzz*; Model-guided fuzz test; Invariants (requirements) hold; Bug found; Implementation code; Git commit; JIRA ticket; Prompts; To LLM service]
• Spec per change
• Less duplicated work
Coverage for every change guaranteed by fuzz test
* Gulcan, E. B., Ozkan, B. K., Majumdar, R., & Nagendra, S. (2025). Model-guided Fuzzing of Distributed Systems. https://arxiv.org/abs/2410.02307
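To make "check the implementation against the model" concrete, here is a toy sketch. It is not the ModelFuzz tool and leaves out its model-coverage guidance: it only runs random interleavings against a reference model. The lost-update race in the toy implementation is an assumed stand-in, not the actual root cause of the incident, and the "Size" attribute is made up.

```python
import random
from typing import Dict, List, Optional

Attributes = Dict[str, str]

INITIAL: Attributes = {"Color": "Dark blue"}
CHANGES: Dict[str, Attributes] = {
    "A": {"Authentication": "Verified"},
    "B": {"Size": "Large"},  # hypothetical second attribute
}


def model_final_state() -> Attributes:
    """Reference model: every change is merged, nothing is dropped."""
    state = dict(INITIAL)
    for change in CHANGES.values():
        state.update(change)
    return state


def run_schedule(schedule: List[str]) -> Attributes:
    """Toy implementation (assumed): a writer's first step reads a snapshot,
    its second step writes snapshot + change back to the store."""
    store = dict(INITIAL)
    snapshots: Dict[str, Attributes] = {}
    for writer in schedule:
        if writer not in snapshots:
            snapshots[writer] = dict(store)  # read step
        else:
            store = {**snapshots[writer], **CHANGES[writer]}  # write step
    return store


def random_schedule(rng: random.Random) -> List[str]:
    """Random interleaving in which each writer takes exactly two steps."""
    steps = [w for w in CHANGES for _ in range(2)]
    rng.shuffle(steps)
    return steps


def fuzz(trials: int = 200, seed: int = 0) -> Optional[List[str]]:
    """Return the first schedule whose result differs from the model, if any."""
    rng = random.Random(seed)
    expected = model_final_state()
    for _ in range(trials):
        schedule = random_schedule(rng)
        if run_schedule(schedule) != expected:
            return schedule
    return None


counterexample = fuzz()
print("counterexample schedule:", counterexample)
```

A sequential schedule such as A, A, B, B matches the model. An interleaved one such as A, B, A, B drops an attribute without raising any error, which is the kind of silent violation the fuzz test is meant to surface.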
attribute-dropping bug shows the invariant violation, written entirely by an LLM service within one day
For Features
Another TLA+ spec, also written entirely by an LLM service, for an API flow change to prove its integrity
Next Step
Complete the ModelFuzz setup and run it against the whole attributes data store service to showcase the method
+ TLA+ = Faster, logical debugging
🧠 Automation can make correctness part of the workflow
🚀 Code first, verify continuously
🌱 Next step: make “robust CI” the default.