Crack open sbt and rock your build times

Triplequote
September 03, 2019

Transcript

  1. My background

    • Co-Founder of Triplequote
    • Goal: Make Scala compilation fast!
      ◦ Triplequote Hydra (commercial multicore Scala compiler)
    • Hydra speedups range from 2x up to 7x
      ◦ Projects using FP and type-level programming techniques benefit greatly from it
    • Try it out: triplequote.com/hydra/trial
    • 50% discount on ALL yearly subscriptions!
      ◦ Use the coupon code: SCALAWORLD2019 (expires in 7 days)

    But Scala compilation is NOT the only drag in our builds...
  2. Motivation

    • sbt is the de facto build tool for Scala
    • It takes time to get sbt to the point where it is ready to compile
    • Local and CI times often differ, and it’s often not obvious why
    • Executing compile or test may trigger a multitude of other tasks

    Which inefficiencies are inherent to sbt, and which depend on the setup?
  3. What’s in an sbt build?

    An sbt build is essentially a collection of settings. They are loaded from different places:

    • Global plugins: ${user.home}/.sbt/1.0/plugins/
    • Project plugins: ${PWD}/project/
    • Project settings: defined in the build
  4. Settings essentials

    • Settings generate values
    • Settings usually have dependencies
    • To evaluate a setting, its dependencies must be evaluated first
      ◦ It’s a graph (more precisely, a DAG: Directed Acyclic Graph)

    How can we find the dependencies of a setting?
    How can we know where a setting is defined (e.g., which plugin contributes it)?
  5. Dependencies of a setting

    Run inspect tree <setting>:

        sbt:root> inspect tree sourceDirectory
        [info] sourceDirectory = src
        [info]   +-baseDirectory =
        [info]     +-thisProject = Project(id root, base: /projects/scalaworld, …)

    Dependency chain: sourceDirectory → baseDirectory → thisProject
  6. Where is a setting defined?

    Run inspect <setting>:

        sbt:root> inspect sourceDirectory
        ...
        [info] Defined at:
        [info]  (sbt.Defaults.paths) Defaults.scala:328
        ...

        // In sbt/Defaults.scala
        sourceDirectory := baseDirectory.value / "src"
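
To make the dependency idea concrete, here is a minimal sketch of a user-defined setting that depends on another one. The key name docsDirectory and the "docs" folder are hypothetical and exist only for illustration; the definition mirrors the sourceDirectory line quoted from Defaults.scala.

```scala
// build.sbt
// Hypothetical key, introduced only for illustration.
lazy val docsDirectory = settingKey[File]("Directory holding the project documentation")

// Depends on baseDirectory, in the same way the sourceDirectory definition does.
docsDirectory := baseDirectory.value / "docs"
```

With this in place, `inspect tree docsDirectory` should list baseDirectory as a dependency, and `inspect docsDirectory` should report build.sbt as the place where the setting is defined.
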
  7. Build settings size

    It turns out a typical sbt build has quite a few settings:

    • guardian/frontend@2ba8094 has 23793
    • ornicar/lila@0647f7f has 29712
    • akka/akka@bc4c6ba has 32047
    • circe/circe@8eb2cd5 has 37007

    ⇒ ~30k settings!
  8. Digression: How to find the settings size?

        $ sbt
        [info] Loading ...
        [info] Resolving key references (32047 settings) ...
        ...
        akka > consoleProject
        [info] Starting scala interpreter…
        ...
        scala> buildStructure.eval.settings.size
        res0: Int = 32047

    • The "Resolving key references" line is printed only if the build has 10k+ settings
    • consoleProject starts the Scala interpreter with sbt and the build definition available

    Tip: consoleProject can be very useful for debugging your build!
  9. Why so many settings?

    • An empty sbt build has ~700 settings
      ◦ Roughly 200 are global and 500 are project settings
    • The explosion is due to plugins + scoping axes
      ◦ Configuration
      ◦ Project
      ◦ Task

        sbt:root> show root / Compile / compile / sourceDirectory
        [info] .../src/main
        sbt:root> show root / Test / compile / sourceDirectory
        [info] .../src/test

    General form: show <proj> / <config> / <task> / <setting>
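
As a rough illustration of how the scoping axes multiply settings, the sketch below defines the same kind of key under different configuration and task scopes. The project name and the chosen compiler options are assumptions made for the example, and the slash syntax requires sbt 1.1 or later.

```scala
// build.sbt
// Hypothetical single-project build, used only to illustrate scoping.
lazy val root = (project in file("."))
  .settings(
    name := "scoping-demo",
    // Scoped to the Compile configuration only.
    Compile / scalacOptions += "-deprecation",
    // Scoped to the Test configuration only.
    Test / scalacOptions += "-Xlint",
    // Scoped to a configuration and a task: only `Test / run` forks a JVM.
    Test / run / fork := true
  )
```

Each key now has a distinct value per scope, which can be checked with `show root / Compile / scalacOptions` and `show root / Test / scalacOptions`.
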
  10. Loading a build

    • Settings are compiled into a task graph
    • The task graph is used to execute the build
    • Loading the build in memory takes time
  11. Loading a build

    time sbt exit

    | Project                       | Local | CI w/ cache | CI w/o cache |
    |-------------------------------|-------|-------------|--------------|
    | guardian/frontend (sbt 1.2.8) | 14s   | 30s         | 51s          |
    | akka/akka (sbt 1.2.8)         | 18s   | 40s         | 3m53s        |
    | ornicar/lila (sbt 0.13.18)    | 12s   | 23s         | 6m58s        |
    | circe/circe (sbt 1.3.0-RC4)   | 13s   | 24s         | 1m5s         |

    • Local: no resolution and no metabuild compilation; the cache is fully up to date
    • CI w/ cache: cached .ivy2 and .sbt
    • CI w/o cache: nothing is cached
  12. What’s in the .sbt/ dir?

    1. The ~80 dependencies needed to start sbt
    2. The compiled compiler-bridge
      ◦ The compiler-bridge is the interface between sbt and the Scala compiler
      ◦ The compiler-bridge is a source component
      ◦ Compiling the bridge with a cold JVM takes about 10s

        [info] Compiling 5 Scala sources to .../scala-2.12/classes ...
        [info] Non-compiled module 'compiler-bridge_2.12' for Scala 2.12.8. Compiling...
        [info] Compilation completed in 9.759s.
  13. Caching on the CI is a must!

    time sbt exit

    | Project                       | CI w/ cache | CI w/o cache |
    |-------------------------------|-------------|--------------|
    | guardian/frontend (sbt 1.2.8) | 30s         | 51s          |
    | akka/akka (sbt 1.2.8)         | 40s         | 3m53s        |
    | ornicar/lila (sbt 0.13.18)    | 23s         | 6m58s        |
    | circe/circe (sbt 1.3.0-RC4)   | 24s         | 1m5s         |

    Caching the .ivy2 and .sbt/boot directories is a must!
  14. Local vs CI times: Why are they different?

    But why is there still a significant difference between Local and CI w/ cache?

    time sbt exit

    | Project                       | Local | CI w/ cache |
    |-------------------------------|-------|-------------|
    | guardian/frontend (sbt 1.2.8) | 14s   | 30s         |
    | akka/akka (sbt 1.2.8)         | 18s   | 40s         |
    | ornicar/lila (sbt 0.13.18)    | 12s   | 23s         |
    | circe/circe (sbt 1.3.0-RC4)   | 13s   | 24s         |
  15. Loading a build on the CI

        > sbt // on the CI
        [info] Loading global plugins from /Users/mirco/.sbt/1.0/plugins/project
        [info] Loading global plugins from /Users/mirco/.sbt/1.0/plugins
        [info] Loading settings for project akka-build from plugins.sbt ...
        [info] Loading project definition from /Users/mirco/Projects/oos/akka/project
        [info] Updating ProjectRef(uri("file:/Users/mirco/Projects/oos/akka/project/"), "akka-build")...
        [info] Done updating.
        [warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
        [info] Compiling 34 Scala sources to /Users/mirco/Projects/oos/akka/project/target/scala-2.12/sbt-1.0/classes ...
        [warn] there was one feature warning; re-run with -feature for details
        [warn] one warning found
        [info] Done compiling.
        [info] Loading settings for project akka from build.sbt ...
        [info] Resolving key references (31225 settings) ...
        [info] Set current project to akka (in build file:/Users/mirco/Projects/oos/akka/)
  16. The metabuild project

    • The metabuild is a synthetic project created by sbt to compile the build files
    • Its dependencies are usually sbt plugins
      ◦ Plugins have dependencies of their own: resolution can be expensive
    • The compiled build files, together with the result of resolution, are stored in project/target

        [info] Updating ProjectRef(uri("file:/Users/mirco/Projects/oos/akka/project/"), "akka-build")...
        [info] Done updating.
        [warn] There may be incompatibilities among your library dependencies; run 'evicted' to see detailed eviction warnings.
        [info] Compiling 34 Scala sources to /Users/mirco/Projects/oos/akka/project/target/scala-2.12/sbt-1.0/classes ...
        [warn] there was one feature warning; re-run with -feature for details
        [warn] one warning found
        [info] Done compiling.
  17. CI best practices

    1. Cache the .ivy2 and .sbt directories
    2. Chain tasks instead of starting sbt multiple times
      a. Run sbt task1 task2 instead of sbt task1 && sbt task2
    3. Avoid SNAPSHOT dependencies
      a. Why? They get re-resolved every time
        i. If you must: set up a proxy! (more on this later)
        ii. Alternatively: bind the dependency to a specific resolver
      b. sbt resolution of SNAPSHOTs is buggy (local is not resolved before cache)
        i. e.g., sbt/sbt#2687 - supposedly fixed in 1.3.0
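
One way to make task chaining the default on the CI is a command alias, so that the CI script needs only a single sbt invocation. This is a sketch: the alias name "ci" and the list of tasks are assumptions, to be adjusted to the build at hand.

```scala
// build.sbt
// Hypothetical alias name and task list, shown only as an example.
// `sbt ci` then runs both tasks in one sbt session, paying the startup cost once.
addCommandAlias("ci", "; compile; test")
```

The CI job would then call `sbt ci` rather than starting sbt once per task.
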
  18. Digression: Forcing resolution from a specific resolver

    Meet UpdateOptions.withModuleResolvers

        updateOptions := {
          // myResolver is a Resolver defined elsewhere in the build
          val deps2resolvers = Map(
            "com.org" %% "my-lib" % "1.0.0-SNAPSHOT" -> myResolver
          )
          updateOptions.value.withModuleResolvers(deps2resolvers)
        }
  19. sbt compile

    • What tasks are executed on compile?
    • Are any of the triggered tasks expensive? How can you find out?
      ◦ Enable the built-in profiling capabilities! (or use sbt-optimizer)
  20. sbt compile

    • What tasks are executed on compile?
    • Are any of the triggered tasks expensive? How can you find out?
      ◦ Enable the built-in profiling capabilities! (or use sbt-optimizer)

        $ sbt -Dsbt.task.timings=true
        …
        // iteration: 1 (after a clean)
        [root] $ common/compile
        ...
        [info] Compiling 519 Scala sources to /frontend/common/target/scala-2.12/classes ...
        ProjectRef(uri("file:/frontend/"), "common") / Compile / compileIncremental : 81219 ms
        ProjectRef(uri("file:/frontend/"), "common") / update : 7003 ms
        ProjectRef(uri("file:/frontend/"), "common") / Compile / twirlCompileTemplates : 2205 ms
        …
        // iteration: 2
        [root] $ common/compile
        ProjectRef(uri("file:/frontend/"), "common") / update : 1061 ms

    • Iteration 1, update (7003 ms): no downloads, just resolution time
    • Iteration 2, update (1061 ms): deserializing the update report
  21. Why is resolution so slow?

    • sbt uses Apache Ivy to resolve dependencies
    • Ivy doesn’t scale well with large dependency graphs
      ◦ Increasing number of version conflicts
      ◦ Checking exclusion and override rules becomes increasingly expensive
    • That’s why clean is generally a terrible idea
    • Does setting up a proxy help?

        // iteration: 1 (after a clean)
        [root] $ common/compile
        ...
        [info] Compiling 519 Scala sources to /frontend/common/target/scala-2.12/classes ...
        ProjectRef(uri("file:/frontend/"), "common") / Compile / compileIncremental : 81219 ms
        ProjectRef(uri("file:/frontend/"), "common") / update : 7003 ms

    No downloads, just resolution time.
  22. Local Proxy ⇒ Fast resolution?

    • A proxy helps with downloading artifacts but has no impact on resolution
    • A proxy makes you resilient to hiccups in upstream repositories
      ◦ Hence, even if it doesn’t help with resolution, you should still set it up
    • To properly configure sbt to use a proxy you have to
      ◦ Update ~/.sbt/repositories
      ◦ Start sbt with -Dsbt.override.repos=true
        ▪ This prevents resolvers added to the build from circumventing the proxy configuration
  23. How to speed up resolution?

    • Option 1: Ditch Ivy and use Coursier
      ◦ It’s easy to try it out: simply use the sbt-coursier plugin
      ◦ But be aware that Coursier has different semantics for resolving dependencies
    • Option 2: Stay tuned for the announcement at the end!
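
Trying out Option 1 usually amounts to one line in the metabuild. The snippet below is a sketch: the version number is an assumption and should be checked against the sbt-coursier releases.

```scala
// project/plugins.sbt
// The version number is an assumption; look up the current sbt-coursier release before using it.
addSbtPlugin("io.get-coursier" % "sbt-coursier" % "1.0.3")
```

Note that from sbt 1.3.0 onward Coursier is the default resolver, so the plugin is mainly relevant for older sbt versions; on 1.3.x, `ThisBuild / useCoursier := false` switches back to Ivy for comparison.
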
  24. Code generation

    • Costly code generation shouldn’t happen at every compile
    • Example: generating the Slick table definitions from a DB

        $ sbt -Dsbt.task.timings=true
        …
        [root] $ root/compile
        ProjectRef(uri("file:/db"), "root") genTables : 2987 ms

    How can it be avoided?
  25. Meet FileFunction.cached

    • FileFunction.cached is a caching facility included in sbt
    • Execute code generation logic iff the inputs have changed
    • For Slick we can use DB migration files as inputs
    • No changes in the migration files ⇒ generated code is up to date!
  26. Meet FileFunction.cached

    • FileFunction.cached is a caching facility included in sbt
    • Execute code generation logic iff the inputs have changed
    • For Slick we can use DB migration files as inputs
    • No changes in the migration files ⇒ generated code is up to date!

        slickCodeGenTask := {
          // cp, slickDriver, jdbcDriver, url, outputDir, pkg and s are defined elsewhere in the build
          val run = runner.value // evaluate the task outside of the anonymous function
          val cachedGen = FileFunction.cached(target.value / "slickCodeGen", inStyle = FileInfo.hash) { in: Set[File] =>
            run.run("slick.codegen.SourceCodeGenerator", cp.files,
              Array(slickDriver, jdbcDriver, url, outputDir.getPath, pkg), s.log).failed foreach (sys error _.getMessage)
            val fname = outputDir + "/my/slick/tables/Tables.scala"
            Set(file(fname)) // the cached function must return the set of generated files
          }
          cachedGen(IO.listFiles(file("migrations")).toSet)
        }
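
For readers who want to experiment without a database, the following is a self-contained sketch of the same caching pattern. Everything specific in it is an assumption made for illustration: the key name genMigrationsIndex, the migrations/ folder with *.sql files, and the generated demo.Migrations object. It is not the Slick setup from the slide, only the same FileFunction.cached technique applied to a trivial generator.

```scala
// build.sbt
// Hypothetical key and file names, chosen only for this sketch.
lazy val genMigrationsIndex =
  taskKey[Seq[File]]("Generates a Scala source listing the files found in migrations/")

genMigrationsIndex := {
  // Evaluate settings and tasks up front, outside of the cached function.
  val log      = streams.value.log
  val cacheDir = streams.value.cacheDirectory / "gen-migrations-index"
  val outFile  = (Compile / sourceManaged).value / "demo" / "Migrations.scala"
  val inputs   = ((baseDirectory.value / "migrations") ** "*.sql").get.toSet

  // The body runs only when the content hash of the input files changes.
  val cachedGen = FileFunction.cached(cacheDir, inStyle = FileInfo.hash) { (in: Set[File]) =>
    log.info(s"Inputs changed, regenerating from ${in.size} file(s)")
    val names = in.toSeq.map(_.getName).sorted.map(n => "\"" + n + "\"").mkString(", ")
    IO.write(
      outFile,
      s"package demo\n\nobject Migrations { val files: Seq[String] = Seq($names) }\n"
    )
    Set(outFile)
  }

  cachedGen(inputs).toSeq
}

// Hook the generator into compilation; the cache keeps repeated compiles cheap.
Compile / sourceGenerators += genMigrationsIndex.taskValue
```

Running compile twice in a row should log the "Inputs changed" message only the first time; touching the content of a file under migrations/ should trigger it again.
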
  27. Source formatting

    • Formatting sources on compile is another typical overhead
    • If you are using sbt-scalastyle it’s even worse
      ◦ No built-in incremental support
      ◦ Following its docs blindly results in all sources being formatted at each compile!
    • Once more FileFunction.cached comes to the rescue
  28. Source formatting

    • Formatting sources on compile is another typical overhead
    • If you are using sbt-scalastyle it’s even worse
      ◦ No built-in incremental support
      ◦ Following its docs blindly results in all sources being formatted at each compile
    • Once more FileFunction.cached comes to the rescue

        scalaStyleIncremental := Def.task {
          val cacheStore = CacheStoreFactory(target.value / "scalastyle-cached")
          val cachedScalaStyle = FileFunction.cached(cacheStore) {
            (inReport: ChangeReport[File], outReport: ChangeReport[File]) =>
              // Process only added or modified sources, unless outputs were removed
              val sourcesToFormat =
                if (outReport.removed.isEmpty)
                  inReport.added.map(_.getAbsolutePath).toSeq ++ inReport.modified.map(_.getAbsolutePath).toSeq
                else
                  inReport.checked.map(_.getAbsolutePath).toSeq
              org.scalastyle.sbt.Tasks.doScalastyle(sourcesToFormat, ...)
          }
          cachedScalaStyle((scalastyleSources.value ** "*.scala").get().toSet)
        }
  29. Take away

    • Sometimes we are to blame for the inefficiencies
      ◦ Suboptimal CI setup
      ◦ Expensive tasks hooked to compile or test
    • Other times the inefficiencies are inherent to sbt
      ◦ Build load time
      ◦ Ivy resolution time

    What can we do about the inefficiencies that are intrinsic to sbt?
  30. Triplequote’s new mission

    • Make Scala builds fast!
      ◦ We are going to own build time, not just compilation time
    • We are working on a Triplequote sbt distribution
      ◦ Blazing fast startup time
      ◦ Faster Ivy resolution
      ◦ More!
    • Just like with Hydra, this distribution will be free for OSS development
    • After sbt we will tackle other build tools too
    • Interested? Stay tuned for updates and follow us on Twitter @triple_quote

    Limited-time offer: 50% discount on all Hydra yearly licenses! Use the coupon SCALAWORLD2019 (expires in 7 days)