Compilation time: a bigger hammer

Triplequote
December 08, 2016

Scala compilation time is excessive. Projects in the range of millions of lines of code frequently experience compilation times of tens of minutes or even hours. Moreover, compilation times are unpredictable: they depend on a combination of language features, external libraries, and type annotations. A single-line change may increase compilation time tenfold. Can we fix this? Come and find out!

Transcript

  1. Who are we?
     Iulian Dragos
     • Triplequote co-founder
     • 12 years of Scala experience
     • Scala committer (#5)
     • Scala IDE committer (#1)
     • SIP committee member
     • Part of the founding Lightbend/Typesafe team
     Mirco Dotta
     • Triplequote co-founder
     • Have been using Scala since 2005
     • Scala IDE committer (#3)
     • Contributed to Scala, Lagom, Play
     • Author of MiMa
     • Part of the founding Lightbend/Typesafe team
  2. Scala releases & compilation speed
     • Scala 2.12 released in November 2016
     • Drastically reduced binary footprint!
       ◦ Runtime benefits!
     • Seems to have comparable performance wrt Scala 2.11
       ◦ Miles reported a slowdown when compiling shapeless tests (~20% slowdown 2.11.8 vs 2.12.1)
       ◦ While Jason noticed moderate speedup on his benchmarks
     • Hard to assess compilation speed of Scala releases
     • Good news: The Scala team is working on automated benchmarking!
  3. Scala Resident Compiler
     • aka FSC (Fast Scala Compiler)
     • A lot of cycles are wasted on startup:
       ◦ JVM is cold and needs time to JIT
       ◦ We mostly work on one project at a time, but at the end of each compilation we throw away the whole symbol table
         ▪ scala-library.jar and rt.jar (JDK) are large and we hit the disk to read them every time [*]
         ▪ the rest of the classpath doesn’t change either!
     [*] The OS may be smarter than that
  4. Scala Resident Compiler
     • scalac becomes a compiler daemon
     • A light frontend feeds it sources to be compiled
     • The daemon reconciles the symbol table between runs
  5. Scala Resident Compiler
     • All symbol types are indexed by time
       ◦ Compilers are databases (Martin Odersky, Scala World 2016)
     • Time is (runId, phaseId)
     • An old symbol needs to be adapted to a new run
       ◦ Look up a corresponding symbol in the new run
       ◦ Not always possible due to inconsistencies
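To make "types indexed by time" concrete, here is a minimal toy sketch. It is not scalac's real data structure: `Period`, `ToySymbol`, `setInfo` and `infoAt` are hypothetical names invented for illustration. It only shows the idea that a symbol's type is looked up for a given (runId, phaseId), and that entries from an older run are not visible in a new run without explicit adaptation.

```scala
// Hypothetical toy model; scalac's actual symbol table is far more involved.
final case class Period(runId: Int, phaseId: Int)

final class ToySymbol(val name: String) {
  // Every type this symbol has had, tagged with the period it was set in.
  private var history: List[(Period, String)] = Nil

  def setInfo(at: Period, tpe: String): Unit =
    history = (at -> tpe) :: history

  // The type as seen at `at`: the latest entry from the same run whose
  // phase is not later than `at`. Entries from other runs are ignored,
  // which is why an old symbol must be adapted before reuse.
  def infoAt(at: Period): Option[String] = {
    val visible = history.filter { case (p, _) =>
      p.runId == at.runId && p.phaseId <= at.phaseId
    }
    if (visible.isEmpty) None
    else Some(visible.maxBy { case (p, _) => p.phaseId }._2)
  }
}
```

For example, a symbol whose type is `T` after an early phase and `Object` after a later one answers differently depending on the phase you ask about, and answers nothing at all for a run it was never adapted to.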
  6. Scala Resident Compiler
     • Pros:
       ◦ Huge time savings
       ◦ Probably an upper bound on what scalac can cache!
     • Cons:
       ◦ Complex, on-demand logic for adapting symbols
       ◦ Fragile when code changes (class moves, renamed methods, etc)
         ▪ .. top-level classes and packages
     Dotty (probably) figured this out!
  7. Scala Presentation Compiler
     • Reuse resident compiler, but don’t generate code
     • Fast and mostly correct type-checker
     • Fiddle until the PC is happy, then build
  8. Scala Presentation Compiler
     • A background thread keeps compiling “loaded” files on change
       ◦ The old “resident” mode!
     • Files are added or removed from the set of maintained files
     • Additional requests may interrupt the typechecker
       ◦ Hyperlink
       ◦ Type members (completions)
       ◦ Get type at point
     • Asynchronicity hides the type-checker latency
  9. Scala Presentation Compiler
     • Powers Ensime and Eclipse Scala IDE
     • Pros:
       ◦ Fast!
       ◦ Correct “by design”: reuses the same type-checker as Scala
       ◦ When it works, it’s awesome!
     • Cons:
       ◦ Plagued by spurious, ghost errors
       ◦ Cumbersome API, easy to make mistakes
       ◦ Uses Trees, Types, Symbols from the compiler
         ▪ Strong coupling to a specific Scala version
     Scala.Meta will figure this out!
  10. Incremental compilation
     • Sbt offers incremental compilation (a.k.a. Zinc)
     • Understanding how it works can make your code compile faster
  11. Sbt incremental compilation explained
     • 3 phases are installed into the compiler pipeline
     • Track dependencies between source files
     • If the interface of classes, objects, traits in a file changes, all files dependent on that source must be recompiled
     • Algorithm:
       ◦ Recompile changed sources
       ◦ Repeat at most 3 times: Recompile dependencies of sources with API changes in prev step
         i. Full recompile if more than 50% of the sources are invalidated
       ◦ Recompile the files that transitively depend on sources with API changes in prev step
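The algorithm on this slide can be sketched as a small planning function. This is a simplified toy, not Zinc's implementation: `IncrementalSketch`, `plan`, `dependents` and `apiChanged` are hypothetical names, sources are plain strings, and the 50% rule is assumed to apply to the batch invalidated in a single step. The function returns the batches of files that would be compiled, in order.

```scala
object IncrementalSketch {
  type Source = String

  // dependents(f): files that directly depend on f.
  // apiChanged(f): stands in for "recompiling f changed its public interface".
  def plan(
      all: Set[Source],
      changed: Set[Source],
      dependents: Source => Set[Source],
      apiChanged: Source => Boolean,
      maxCycles: Int = 3
  ): List[Set[Source]] = {

    // Everything reachable from `from` by following `dependents`.
    def transitive(from: Set[Source]): Set[Source] = {
      @annotation.tailrec
      def go(frontier: Set[Source], seen: Set[Source]): Set[Source] =
        if (frontier.isEmpty) seen
        else go(frontier.flatMap(dependents) -- seen -- frontier, seen ++ frontier)
      go(from.flatMap(dependents), Set.empty)
    }

    @annotation.tailrec
    def loop(batch: Set[Source], done: Set[Source], cycle: Int,
             acc: List[Set[Source]]): List[Set[Source]] = {
      val withApiChange = batch.filter(apiChanged)
      if (withApiChange.isEmpty) acc.reverse
      else if (cycle == maxCycles) {
        // Cycle budget used up: take all transitive dependents in one batch.
        val rest = transitive(withApiChange) -- done
        (if (rest.isEmpty) acc else rest :: acc).reverse
      } else {
        val next = withApiChange.flatMap(dependents) -- done
        if (next.isEmpty) acc.reverse
        else if (next.size * 2 > all.size) (all :: acc).reverse // over 50%: full recompile
        else loop(next, done ++ next, cycle + 1, next :: acc)
      }
    }

    loop(changed, changed, 0, List(changed))
  }
}
```

With a dependency chain where every recompilation changes an API, the plan shows three single-step batches followed by one transitive batch; when APIs stay stable, it stops after the first step. That is why keeping public interfaces stable matters so much for incremental build times.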
  12. Incremental compilation tips
     • Make the return type of your public members explicit
     • Strive for as few class/trait/object declarations per source file as possible (Sbt specific)
     • Update to Sbt 0.13.13! Today!!
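A small illustration of the first tip, using hypothetical names (`Shape`, `Circle`, `Shapes`). When the return type is inferred, the public signature follows whatever the body happens to return, so an implementation change can become an API change and invalidate dependent files. An explicit type pins the interface.

```scala
trait Shape { def area: Double }

final case class Circle(r: Double) extends Shape {
  def area: Double = math.Pi * r * r
}

object Shapes {
  // Inferred: the public type is Circle. Returning a different Shape later
  // changes this member's signature, which counts as an API change.
  def unitInferred = Circle(1.0)

  // Explicit: callers only see Shape, so the body can change freely
  // without altering the interface that dependents were compiled against.
  def unit: Shape = Circle(1.0)
}
```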
  13. Incremental compiler downside
     • The incremental compiler installs 3 phases
     • Additional phases increase compile time
     • Our experiments showed a compilation slowdown of 8% when compiling dotty, and 16% on akka.
  14. Sbt + Modularization = Win?
     • Sbt can compile independent projects in parallel (multi-module build)
     • This works well if modules have no inter-dependencies
     • Problem: Often not the case, hence you are back to sequential compilation
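As a rough illustration, a multi-module `build.sbt` might look like the fragment below. The module names (`core`, `json`, `api`) are made up for this sketch. Modules with no dependency between them can be compiled side by side, while any module that uses `dependsOn` has to wait for its upstream modules.

```scala
// build.sbt (hypothetical module names)
lazy val core = project                        // no dependencies
lazy val json = project                        // independent of core: can compile in parallel with it
lazy val api  = project.dependsOn(core, json)  // must wait for both core and json

lazy val root = (project in file(".")).aggregate(core, json, api)
```

In a build where most modules form a chain of `dependsOn` links, there is little left to run in parallel, which is the problem the slide points out.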
  15. Hydra
     • Distributed Scala compiler
       ◦ Inspired by the simplicity of Spark
     • Completely transparent, support existing tools
     • Get a handle on build times
       ◦ Understand where time is spent
       ◦ Monitoring
  16. Distribute Scala compiler
     • Distribute files to workers in the cloud
     • Gather outputs (bytecode)
     • Cost vs. benefits: you add the network cost
       ◦ Network costs are small compared to compilation costs
     • Hard problem:
       ◦ Inter-file dependencies
  17. Distribute or parallelize?
     • What about the many cores we have?
     • What is the right granularity?
       ◦ Parallelize at the CompilationUnit level
       ◦ Parallelize the whole phase pipeline
     • Hard problems
       ◦ Inter-file dependencies between workers
       ◦ Laziness and mutable state add to the complexity
     • Solution?
       ◦ Same as for a distributed build: use different Global instances
       ◦ Use every opportunity to share
  18. Amdahl’s law
     How much speedup should we expect?
     p = percentage of parallelizable work
     s = speedup for parallelizable work (~ nr. of cores)
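The formula itself did not survive in the transcript. In its standard form, with the symbols defined on this slide and p written as a fraction between 0 and 1, Amdahl's law reads: speedup = 1 / ((1 - p) + p / s). A minimal sketch of that calculation follows; the object and method names are arbitrary.

```scala
object Amdahl {
  // p: fraction of the work that can run in parallel (0.0 to 1.0)
  // s: speedup achieved on that fraction (roughly the number of cores)
  def speedup(p: Double, s: Double): Double = {
    require(p >= 0.0 && p <= 1.0, "p must be a fraction between 0 and 1")
    require(s > 0.0, "s must be positive")
    1.0 / ((1.0 - p) + p / s)
  }
}
```

For instance, `Amdahl.speedup(0.5, 4.0)` gives 1.6: if only half of the work parallelizes, four cores deliver well under a 2x gain, and no number of cores can push it past 2x.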
  19. Constant factor
     • Can we get that factor down?
     • Dmitry’s talk about Dotty performance
       ◦ Know your hardware and great things may happen!
         ▪ ~30x speedup potential by utilizing the CPU pipeline to the fullest
     If the sequential share (1 - p) goes down, speedup goes up!
  20. Triplequote Hydra Compiler
     • Based on Scalac
       ◦ We’re confident this won’t be a fork
     • Drop-in replacement
       ◦ Not tied to Sbt, command line available
     • Supporting Scala 2.11.7 or newer
       ◦ May support older versions based on demand
       ◦ Other forks™ should work too
     • Run the community build
       ◦ Compare binary output
     ..standing on the shoulders of giants
  21. Insights
     • Monitor the build
       ◦ Build times across builds, commits or files
       ◦ Other metrics coming
         ▪ implicit uses, type inference steps, etc
         ▪ GC load on workers, time spent blocking, etc.
  22. Hydrate. Build.
     • Using Hydra equates to adding one line to your Sbt build!
       addSbtPlugin("com.triplequote" % "sbt-hydra" % "1.0")
     • Run sbt tasks as usual
     • Hydra takes care of parallel compilation
     • Currently, Sbt support only
     • Support for other build tools (e.g., Gradle, Maven) based on demand
  23. Current status
     • We have a working prototype
     • Can compile the akka codebase (all modules!)

     Scala 2.11.7    Hydra (4 CPUs)    Speedup
     226s            150s              1.50x

     Sbt 0.13.13, Java 8, i7 2.3 GHz
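As a back-of-the-envelope check, Amdahl's law can be inverted to estimate which fraction of the akka build behaved as if it were parallel. This rests on a strong assumption that is not from the talk: that the parallel part scales perfectly across the 4 CPUs (s = 4). Solving speedup = 1 / ((1 - p) + p / s) for p gives p = (1 - 1/speedup) / (1 - 1/s). The names below are arbitrary.

```scala
object ImpliedFraction {
  // Inverts Amdahl's law: given a measured overall speedup and the assumed
  // speedup s of the parallel part, estimate the parallel fraction p.
  def parallelFraction(measuredSpeedup: Double, s: Double): Double = {
    require(measuredSpeedup >= 1.0, "expects a speedup, not a slowdown")
    require(s > 1.0, "s must be greater than 1")
    (1.0 - 1.0 / measuredSpeedup) / (1.0 - 1.0 / s)
  }
}
```

With the numbers on the slide, `ImpliedFraction.parallelFraction(226.0 / 150.0, 4.0)` comes out at roughly 0.45. Under the idealized assumption above, a bit less than half of the compile time was effectively parallelized in this prototype.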
  24. Giving back to the community
     • Push patches upstream to Scala and Sbt
     • Work well with Scala and Typelevel Scala
  25. Roadmap
     • Supporting Scala 2.12
     • Benchmark more open source projects
       ◦ Community build
     • Private beta
       ◦ Invitation system
       ◦ Follow us on twitter @triple_quote for updates
     • Distributed compilation
     • Metrics insights / dashboard
  26. What now?
     • We are looking for a few reference customers who want Hydra asap
     • We will make using Hydra in your projects our priority
     • Interested? Grab us after the talk or [email protected]