Are You Tall Enough for This Ride? Real-world Challenges in Code Generation

He just released a patch release in smithy4s. Little did he know, a “simple bugfix” would turn into a compilation error for several dozen developers, faster than he could say “binary compatibility.”

Our tale of woe begins with an innocent, one-line change to companion objects, continues through the inevitable GitHub issues from confused users, and culminates in a classic engineer’s dilemma: break backward compatibility or keep the bugs? (spoiler: we chose the secret option C).

Through this real-world disaster-turned-teaching-moment, we’ll navigate the double-edged sword of code generation - powerful enough to create cross-language platforms with ease, yet temperamental enough to bring down your API with a single diff. We’ll demonstrate how a cleverly concealed macro saved our users from frustration while preserving our bugfix integrity.

Come for the horror story of wicked workarounds, stay for the practical strategies on taming generated code with scalameta, snapshot tests, and knowing when you’re “tall enough” for the codegen rollercoaster. Because sometimes in Scala, the real bugs were the friends we made along the way.

Jakub Kozłowski

August 20, 2025

Transcript

  1. Are You Tall Enough for This Ride? Real-world Challenges in Code Generation

    August 20, 2025 | Scala Days | Lausanne
  2. It all starts with a 1-line diff…

        sealed trait Podcast extends Product with Serializable
        ...
        object Podcast extends ShapeTag.Companion[Podcast] {
        -  object Video extends ShapeTag.Companion[Video] {
        +  object Video {
             val id: ShapeId = ShapeId("smithy4s.example", "Video")
             val hints: Hints = Hints.empty
           }
  3. In this talk, we will show you

    📓 how this story unfolded
    🤔 how import scala.quoted.* saved us
    📚 how to tell if codegen is your thing
  4. A few years ago, The Company™ decided to build a new streaming platform

    🌱 fresh start, lots of greenfield
    ✉️ event-driven microservices, high degree of granularity
  5. Lots of events == lots of trouble

    tens of APIs/services to build
    several programming languages / tech stacks
    multiple platforms (backend, mobile, web, automotive…)
    documentation, event persistence, backward compatibility…
    How to keep it manageable? 🤯
  6. One possible solution: an IDL (Interface Definition Language)

    📜 Use a common schema language to describe all cross-service communication
    ➡️ Generate code, documentation and other outputs from a single source of truth
    ⚙️ Automate compatibility checks
  7. But wait, there’s more!

    [Diagram: IDL feeding Client code, Server code, Documentation, CLIs, API History, Compatibility checks, Table schemas, ...]
  8. Smithy: The Company™'s IDL of choice!

    ✅ language- and protocol-agnostic
    ✅ small, but extensible language
    ✅ actively maintained, mature baseline
    ✅ good tooling

        @http(method: "GET", uri: "/weather/{city}")
        @readonly
        operation GetWeather {
          input := {
            @required
            @httpLabel
            city: String
          }
          output := {
            @required
            weather: Weather
          }
        }

        structure Weather {
          @required
          description: String

          @required
          degrees: Celsius
        }

        @range(min: 0)
        integer Celsius
  9. Smithy4s

    "Smithy for Scala"
    Provides code generation + runtime support

        $version: "2"

        namespace input

        structure Person {
          @required
          @default("some rando")
          name: String
        }

        package input

        final case class Person(name: String = "some rando")

        object Person extends ShapeTag.Companion[Person] {
          val id: ShapeId = ShapeId("input", "Person")

          implicit val schema: Schema[Person] = struct(
            string
              .required[Person]("name", _.name)
              .addHints(
                smithy.api.Default(
                  smithy4s.Document.fromString("some rando")
                )
              ),
          )(make).withId(id).addHints(hints)
        }
  10. How we did it (4/6)

    [Diagram. Legend: User code, Company tooling, OSS Tooling. Boxes: smithy, smithy4s, API monorepo, Company protocol, Applications]
  11. How we did it (5/6)

    [Diagram. Legend: User code, Company tooling, OSS Tooling. Boxes: smithy, smithy4s, API monorepo, Company protocol, ???, E2E Tests, Applications]
  12. How we did it (6/6)

    [Diagram. Legend: User code, Company tooling, OSS Tooling. Boxes: smithy, smithy4s, API monorepo, Company protocol, E2E Library, E2E Tests, Applications]
  13. Meet Damian

    Damian wants to write an e2e test
    he already has a data model
    he already has an API client generated from it
  14. Damian wants to write a test

    [Sequence diagram. Participants: Test, GreetingApi, EventBus]
    Test calls greetUser("ScalaUser")
    GreetingApi emits UserGreeted(name = "ScalaUser")
    Test consumes the UserGreeted event
  15. And here’s the test code

        object MySuite extends SimpleIOSuite {

          test("Expect API call emits event") {
            val userName = "ScalaUser"
            for {
              _ <- testing.apis.greetingService.greetUser(userName)
              _ <- testing.events.expectEvent(DomainEvent.UserGreeted){
                user => user.name == userName
              }
            } yield success
          }

        }
  16. Domain - generated code

    Notice the extends ShapeTag[UserGreeted]

        trait DomainEvent
        object DomainEvent extends ShapeTag[DomainEvent] {

          case class UserGreeted(name: String) extends DomainEvent

          object UserGreeted extends ShapeTag[UserGreeted]{
            def id: smithy4s.ShapeId = smithy4s.ShapeId("thecompany.users", "UserGreeted")
            def schema: Schema[DomainEvent.UserGreeted] = ???
          }

          override def id: ShapeId = ???
          override def schema: Schema[DomainEvent] = ???
        }
  17. Consuming events

        def expectEvent[A](tag: ShapeTag[A])(predicate: A => Boolean): IO[Unit] = {
          val streamName = resolveEventStreamName(tag.id)
          val eventStream = streamEvents[A](streamName)
          eventStream
            .find(predicate)
            .timeout(readTimeout)
            .compile
            .last
            .flatMap {
              case Some(value) => IO.unit
              case None =>
                IO.raiseError(
                  new Exception("Could not find an event matching the predicate")
                )
            }
        }
  18. Resolving stream name

    The stream name is resolved based on shape ID

        private def resolveEventStreamName(id: ShapeId): StreamName

    Which is obtained from event companion object

        testing.events.expectEvent(DomainEvent.UserGreeted)(
          _.name == userName
        )

    Which extends ShapeTag that enforces id: smithy4s.ShapeId

        object UserGreeted extends ShapeTag[UserGreeted]{
          def id: smithy4s.ShapeId = smithy4s.ShapeId("thecompany.users", "UserGreeted")
          def schema: Schema[DomainEvent.UserGreeted] = ???
        }
  19. But Damian only cares about the test code

        object MySuite extends SimpleIOSuite {

          test("Expect API call emits event") {
            val userName = "ScalaUser"
            for {
              _ <- testing.apis.greetingService.greetUser(userName)
              _ <- testing.events.expectEvent(DomainEvent.UserGreeted){
                user => user.name == userName
              }
            } yield success
          }

        }
  20. Meanwhile, in Smithy4s HQ

    Daily maintenance tasks happen…
    Suddenly, a user bug report!
    "I’m gonna work on this!" ~a Smithy4s maintainer
  21. Inside the comment section

    @baccata: ADT members shouldn't have implicit ShapeTags of their own. If you widen the member's type (or not), your program's behavior will change drastically!*
    @kubukoz: Agreed, let me remove them!

    * - Rephrased for brevity
  22. Consider this schema

    @adt - inline SampleStruct into the companion object of UnionBeforeChange

        @adt
        union UnionBeforeChange {
          s: SampleStruct
        }

        structure SampleStruct {
          @required
          name: String
        }

        sealed trait UnionBeforeChange
        object UnionBeforeChange extends ShapeTag[UnionBeforeChange] {
          case class SampleStruct(name: String)
          object SampleStruct extends ShapeTag[SampleStruct] { ... }
          ...
        }
  23. Using the wrong ShapeTag of the type can be dangerous

        def writeJson[A: ShapeTag](a: A) = Json.writeBlob(a)(using ShapeTag[A].schema).toUTF8String

        writeJson(UnionBeforeChange.sampleStruct("foo"): UnionBeforeChange)
        // ✅ encoded as union: {"s":{"name":"foo"}}

        writeJson(UnionBeforeChange.sampleStruct("foo"))
        // ⚠️ encoded as struct: {"name":"foo"}
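
To make the hazard on this slide concrete without pulling in Smithy4s, here is a minimal sketch in plain Scala 3. `Tag`, `Union`, `memberTag`, `unionTag`, and this `writeJson` are hypothetical stand-ins, not the Smithy4s API: `Tag` plays the role of `ShapeTag`, and each instance builds a JSON-like string by hand. The point it illustrates is that the instance is chosen from the static type of the argument, so widening the value changes the output.

```scala
// Hypothetical stand-in for smithy4s.ShapeTag: one encoder per static type.
trait Tag[A] {
  def encode(a: A): String
}

sealed trait Union

object Union {
  final case class SampleStruct(name: String) extends Union

  object SampleStruct {
    // Member-level instance: the kind of given the Smithy4s fix removed.
    given memberTag: Tag[SampleStruct] =
      (s: SampleStruct) => s"""{"name":"${s.name}"}"""
  }

  // Union-level instance: wraps the member under its alternative name "s".
  given unionTag: Tag[Union] =
    (u: Union) =>
      u match {
        case s: SampleStruct => s"""{"s":${SampleStruct.memberTag.encode(s)}}"""
      }
}

// The instance is selected from the static type A, not the runtime class.
def writeJson[A](a: A)(using tag: Tag[A]): String = tag.encode(a)

// Static type is the member: the member-level instance is picked.
val asStruct: String = writeJson(Union.SampleStruct("foo"))

// Same value, widened to the union: the union-level instance is picked.
val asUnion: String = writeJson(Union.SampleStruct("foo"): Union)
```

Once the member-level instance is deleted, only the widened call still compiles, which is the behaviour the change aimed for.
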
  24. ShapeTag gets removed from ADT union members

        sealed trait Podcast extends Product with Serializable
        ...
        object Podcast extends ShapeTag.Companion[Podcast] {
        -  object Video extends ShapeTag.Companion[Video] {
        +  object Video {
             val id: ShapeId = ShapeId("smithy4s.example", "Video")
             val hints: Hints = Hints.empty
           }
  25. Only one ShapeTag for ADT unions!

    LGTM, ship it

        writeJson(UnionAfterChange.sampleStruct("foo"): UnionAfterChange)
        // ✅ encoded as union: {"s":{"name":"foo"}}

        writeJson(UnionAfterChange.sampleStruct("foo"))
        // ❌ Error: No given instance of type ShapeTag[UnionAfterChange.SampleStruct] was found
  26. Damian tries to update dependencies

    (as one does) due to an E2E library bugfix in the meantime
  27. Suddenly things break!

        [error] Found:    domainbeforesmithychange.DomainEvent.UserGreeted.type
        [error] Required: smithy4s.ShapeTag[A]
        [error]
        [error] where:    A is a type variable
        [error]   _ <- testing.events.expectEvent(DomainEvent.UserGreeted){

    You know why.

        - object UserGreeted extends ShapeTag.Companion[UserGreeted] {
        + object UserGreeted {
  28. 🤔 What now?

    A: change the E2E testing API
    B: revert the ShapeTag change in Smithy4s
  29. A: change the E2E testing API

    Break compatibility of the expectEvent method?
    😩 very cumbersome for users
    💥 affects hundreds of tests in multiple repos (multiple teams)
    ⚙️ Scalafix migration is an option, but still not super convenient

        // before
        testing.events.expectEvent(DomainEvent.UserGreeted){
          user => user.name == userName
        }

        // after
        testing.events.expectEvent(DomainEvent.UserGreeted.schema){
          user => user.name == userName
        }
  30. B: revert the ShapeTag change in Smithy4s

    🐛 restores the potential footgun (wrong implicit being used)
    ⚠️ against the Smithy4s principles (correctness)
    🚩 bending Open Source principles (catering to the needs of one commercial adopter)

        sealed trait Podcast extends Product with Serializable
        ...
        object Podcast extends ShapeTag.Companion[Podcast] {
        -  object Video {
        +  object Video extends ShapeTag.Companion[Video] {
             val id: ShapeId = ShapeId("smithy4s.example", "Video")
             val hints: Hints = Hints.empty
           }
  31. Least painful option

    💰 Minimize total cost in the org
    🔄 Reduce time spent on coordination, priorities, communication, people involved, approvals
    🫣 Keep complexity in the library, away from the users
  32. Codegen before Smithy4s change

        trait DomainEvent
        object DomainEvent extends ShapeTag[DomainEvent] {

          case class UserGreeted(name: String) extends DomainEvent

          object UserGreeted extends ShapeTag[UserGreeted]{
            def id: smithy4s.ShapeId = smithy4s.ShapeId("thecompany.users", "UserGreeted")
            def schema: Schema[DomainEvent.UserGreeted] = ???
          }

          override def id: ShapeId = ???
          override def schema: Schema[DomainEvent] = ???
        }
  33. Codegen now

        trait DomainEvent
        object DomainEvent extends ShapeTag[DomainEvent] {

          case class UserGreeted(name: String) extends DomainEvent

          // no longer extends ShapeTag, but still has id and schema
          object UserGreeted {
            def id: smithy4s.ShapeId = smithy4s.ShapeId("thecompany.users", "UserGreeted")
            def schema: Schema[DomainEvent.UserGreeted] = ???
          }

          override def id: ShapeId = ???
          override def schema: Schema[DomainEvent] = ???
        }
  34. Testing framework before

    🙅‍♀️ The ShapeTag is no longer there in the codegen, we can no longer use it!

        def expectEvent[A](tag: ShapeTag[A])(predicate: A => Boolean): IO[Unit] = {
          val streamName = resolveEventStreamName(tag.id)
          val eventStream = streamEvents[A](streamName)
          eventStream
            .find(predicate)
            .timeout(readTimeout)
            .compile
            .last
            .flatMap {
              case Some(value) => IO.unit
              case None =>
                IO.raiseError(
                  new Exception("Could not find an event matching the predicate")
                )
            }
        }
  35. Looking for the new signature

    EventType is essentially Any

        def expectEvent[Companion <: { def id: ShapeId }, EventType]
          (companion: Companion)(predicate: EventType => Boolean): IO[Unit] =
          ???

        [error] value name is not a member of Any, but could be made available as an extension method.
        [error]
        [error] The following import might make progress towards fixing the problem:
        [error]
        [error]   import smithy4s.Document.syntax.fromSchema
        [error]
        [error]   user => user.name == userName
        [error]           ^^^^^^^^^
  36. Relating the types

    Companion is related to EventType now

        def expectEvent[Companion <: { def id: ShapeId; def schema: Schema[EventType] }, EventType]
          (companion: Companion)(predicate: EventType => Boolean): IO[Unit] =
          ???
  37. Little cleanup

        type ShapeTagLike[EventType] = {
          def schema: Schema[EventType]
          def id: ShapeId
        }

        def expectEvent[Companion <: ShapeTagLike[EventType], EventType]
          (companion: Companion)(predicate: EventType => Boolean): IO[Unit] =
          ???
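
The slides leave the method body as `???`. As a rough, self-contained sketch of how such a structurally typed signature could be exercised in Scala 3, the snippet below swaps in hypothetical stand-ins for `ShapeId` and `Schema` and a simplified `checkEvent` helper (all assumptions, none of it the real E2E library). It also assumes that member access on a structural type over plain objects goes through `scala.reflect.Selectable.reflectiveSelectable`, which uses Java reflection at runtime.

```scala
import scala.reflect.Selectable.reflectiveSelectable

// Hypothetical stand-ins for smithy4s.ShapeId and smithy4s.Schema.
final case class ShapeId(namespace: String, name: String)
final class Schema[A]

// Anything exposing these two members qualifies, no common supertype needed.
type ShapeTagLike[EventType] = {
  def id: ShapeId
  def schema: Schema[EventType]
}

final case class UserGreeted(name: String)

// Mirrors the "codegen now" shape: a plain companion with id and schema.
object UserGreeted {
  def id: ShapeId = ShapeId("thecompany.users", "UserGreeted")
  def schema: Schema[UserGreeted] = new Schema[UserGreeted]
}

// Simplified stand-in for expectEvent: instead of reading a stream, it
// returns the resolved id and whether one sample event matches.
def checkEvent[Companion <: ShapeTagLike[EventType], EventType](
    companion: Companion
)(predicate: EventType => Boolean)(event: EventType): (ShapeId, Boolean) =
  (companion.id, predicate(event)) // companion.id is a reflective call

// EventType is inferred from the companion's schema member,
// so user.name type-checks inside the predicate.
val result: (ShapeId, Boolean) =
  checkEvent(UserGreeted)(user => user.name == "ScalaUser")(UserGreeted("ScalaUser"))
```

The call site looks the same as with the old `ShapeTag`-based signature, which is what keeps existing tests source compatible; the trade-off is a reflective call on each structural member access.
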
  38. Edge cases

    What if the user provides something weird?
    Now we can misuse the companion object

        object SomeOtherThing {
          def id: smithy4s.ShapeId = ???
          def schema: Schema[DomainEvent.UserGreeted] = ???
        }

        testing.events.expectEvent(SomeOtherThing){
          user => user.name == userName
        }
  39. Guardrails

    What if we could "prove" we are working with the companion object?

        def expectEvent[Companion <: ShapeTagLike[EventType], EventType](
          companion: Companion
        )(
          predicate: EventType => Boolean
        )(using CompanionsEvidence[EventType, Companion]): IO[Unit] =
          ???
  40. Given evidence

        sealed trait Proof

        def check(using Proof) = ()

        check // ❌: won't compile by itself!

        [error] No given instance of type GivenEvidence.Proof was found for parameter x$1 of method
        [error] check in object GivenEvidence
        [error]   check // ❌: won't compile by itself!
        [error]   ^
  41. Given evidence

        sealed trait Proof

        def check(using Proof) = ()

        // Provide the proof
        given Proof = new Proof {}

        check // ✅: compiles!
  42. CompanionsEvidence

    What if we could prove we are working with the companion object?
    If only it existed as a given out of the box

        /** Proof that Obj is a companion of Clazz */
        @implicitNotFound("Could not prove that ${Obj} is a companion of ${Clazz}")
        sealed trait CompanionsEvidence[Clazz, Obj]
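
Before automating anything, it may help to see the evidence pattern with one instance written by hand. The sketch below is an illustration only: `userGreetedPair` and `requireCompanions` are hypothetical names, and the single given is hand-written for one pair rather than derived. It uses `scala.compiletime.testing.typeChecks` to record whether each call would compile.

```scala
import scala.annotation.implicitNotFound
import scala.compiletime.testing.typeChecks

final case class UserGreeted(name: String)
object UserGreeted
object SomeOtherThing

/** Proof that Obj is a companion of Clazz */
@implicitNotFound("Could not prove that ${Obj} is a companion of ${Clazz}")
sealed trait CompanionsEvidence[Clazz, Obj]

object CompanionsEvidence {
  // Hand-written for a single pair, purely for illustration.
  given userGreetedPair: CompanionsEvidence[UserGreeted, UserGreeted.type] =
    new CompanionsEvidence[UserGreeted, UserGreeted.type] {}
}

// Hypothetical helper: compiles only when evidence exists for the pair.
def requireCompanions[Clazz, Obj](using CompanionsEvidence[Clazz, Obj]): Unit = ()

// typeChecks reports at compile time whether the quoted snippet type-checks.
val acceptsRealCompanion: Boolean =
  typeChecks("requireCompanions[UserGreeted, UserGreeted.type]")
val acceptsImpostor: Boolean =
  typeChecks("requireCompanions[UserGreeted, SomeOtherThing.type]")
```

Writing one given per generated event by hand would not scale, which is the gap a compiler-generated instance has to fill.
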
  43. How do macros work?

    Your usual compilation flow goes like this
    👨‍💻 Programmer writes code
    📝 Source Code (.scala files)
    🔨 Scala Compiler (scalac)
    📦 JVM Bytecode (.class files)
  44. How do macros work?

    Well actually
    👨‍💻 Programmer writes code
    📝 Source Code (.scala files)
    🔨 Scala Compiler (scalac)
    ✨ Magic happens
    📦 JVM Bytecode (.class files)
  45. How do macros work?

    Compiler land 🏴‍☠️
    👨‍💻 Programmer writes code
    📝 Source Code (.scala files)
    🔨 Scala Compiler (scalac)
    🎭 Macro Execution: ⚡ Compile-time only, 🔍 Inspects types, 🏗️ Generates CompanionsEvidence
    📦 JVM Bytecode (.class files)
  46. Macro in practice

    Let’s make the compiler generate the given instances of CompanionsEvidence

        /** Proof that Obj is a companion of Clazz */
        @implicitNotFound("Could not prove that ${Obj} is a companion of ${Clazz}")
        sealed trait CompanionsEvidence[Clazz, Obj]
  47. CompanionsEvidence

        import scala.quoted.*

        object CompanionsEvidence {
          // produced only for Clazz Obj companion pairs
          inline given companions[Clazz, Obj]
            : CompanionsEvidence[Clazz, Obj] = ${ companionsImpl[Clazz, Obj] }

          private def companionsImpl[Clazz: Type, Obj: Type](using
            Quotes
          ): Expr[CompanionsEvidence[Clazz, Obj]] = {
            import quotes.reflect.*
            // compile-time representations of both type arguments
            val clazzRepr = TypeRepr.of[Clazz]
            val objRepr = TypeRepr.of[Obj]
            val clazzCompanionRef = clazzRepr.typeSymbol.companionModule.typeRef
            val areCompanions = clazzCompanionRef =:= objRepr
            if areCompanions
            then '{ new CompanionsEvidence[Clazz, Obj] {} }
            else sys.error(s"${objRepr.show} is not a companion of ${clazzRepr.show}")
          }
        }
  48. How does this help?

    The incorrect code no longer compiles

        testing.events.expectEvent(SomeOtherThing){
          user => user.name == userName
        }

        java.lang.RuntimeException: domainaftersmithychange.SomeOtherThing is not a companion of domainaftersmithychange.DomainEvent.UserGreeted

        Could not prove that domainaftersmithychange.SomeOtherThing.type is a companion of domainaftersmithychange.DomainEvent.UserGreeted.
        I found:

            frameworkaftersmithychange.testing.macros.CompanionsEvidence.companions[
              domainaftersmithychange.DomainEvent.UserGreeted,
              domainaftersmithychange.SomeOtherThing.type]

        But Exception occurred while executing macro expansion.
  49. Problem solved!

    ✅ Macros and clever use of types saved the day
    ✅ 600 tests saved
    ✅ The change was not binary compatible, but users didn’t notice
    ✅ Damian moves on
  50. Code generation can be incredibly useful

    …but it has its problems and requirements
    Is it a good fit for you?

        enum Boolean {
          case Yes
          case No
          case Maybe
        }
  51. Can you get the buy-in?

    🐣 Best invest early on
    🤝 IDLs work best when everyone is using them.
    ⏳ Results will take time
  52. Can you accept some loss of flexibility?

    🧱 IDLs require structure
    🚳 Not all existing APIs will fit seamlessly
    👷 Customizations need maintenance
  53. Can you afford a dedicated tooling team?

    💂 Building standards and extensions
    🧹 Housekeeping of the source of truth
    👷 Integrating the tooling in the SDLC
    🧑‍🏫 Education / documentation / support
  54. Do you need this?

    📈 Do you have the scale?
    🤯 Do your APIs overflow your brain?
    🗣️ Do you yearn for better communication?
    🗯️ Do you need to support multiple tech stacks/languages?