Slide 30
Analysis Pipeline
FILTERED OUT:
• unsupported projects (non-SBT projects or projects not supported by ScalaMeta)
• duplicate projects (detected using DéjàVu [1])
• uninteresting projects (fewer than 2 commits, less than 2 months active)

37% of the code base was in 102 repositories containing copies of Apache Spark (the biggest Scala project, 100K+ SLOC).

Threw away over half of the projects and half of the code, yet kept 97% of GitHub stars.

[1] Lopes, Cristina V., Petr Maj, Pedro Martins, Vaibhav Saini, Di Yang, Jakub Zitny, Hitesh Sajnani, and Jan Vitek. "DéjàVu: a map of code duplicates on GitHub." OOPSLA 2017.
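
The three filters named on this slide can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Project` record, its field names, and the helper functions are hypothetical, the duplicate flag is assumed to have been computed beforehand (for example by a tool such as DéjàVu), and the two "uninteresting" criteria are assumed to be combined with "or", which the slide does not state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Project:
    # Hypothetical record; field names are illustrative assumptions.
    name: str
    uses_sbt: bool        # project builds with SBT
    scalameta_ok: bool    # project is supported by ScalaMeta
    is_duplicate: bool    # flagged by a separate duplicate detector
    commits: int
    months_active: float
    stars: int


def rejection_reason(p: Project) -> Optional[str]:
    """Return why a project is filtered out, or None if it is kept."""
    if not (p.uses_sbt and p.scalameta_ok):
        return "unsupported"
    if p.is_duplicate:
        return "duplicate"
    # Assumption: either criterion alone is enough to drop the project.
    if p.commits < 2 or p.months_active < 2:
        return "uninteresting"
    return None


def filter_projects(projects: List[Project]) -> Tuple[List[Project], float]:
    """Return the kept projects and the share of stars they retain."""
    kept = [p for p in projects if rejection_reason(p) is None]
    total_stars = sum(p.stars for p in projects)
    kept_stars = sum(p.stars for p in kept)
    star_share = kept_stars / total_stars if total_stars else 0.0
    return kept, star_share
```

Reporting the retained share of stars alongside the kept list mirrors the slide's sanity check: a filter that discards many projects but keeps nearly all stars is mostly removing low-visibility repositories.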