npm-miner: An Infrastructure for Measuring the Quality of the npm Registry

Presented as a data showcase in Mining Software Repositories (MSR) 2018


    Kyriakos C. Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos, and Andreas L. Symeonidis
    Electrical and Computer Engineering Department, Aristotle University of Thessaloniki, Greece
  2. NPM-REGISTRY & JS

    • Atwood’s law: ‘Any application that can be written in JavaScript, will eventually be written in JavaScript’
    • Front-end: Angular, React, Vue
    • Back-end: node
    • Desktop: electron
    • Mobile: React-Native, NativeScript
    • IoT: Node-RED
    • ML: https://js.tensorflow.org/
    • Believed to be one of the 3 components of the JS revolution (along with project templates and web components)
    [Figures: 551/day; RedMonk Jan 2018: # GitHub projects ~ # of Stack Overflow tags; Modulecounts.com: # packages since 2010]
    29 May 2018, MSR 2018, Gothenburg
  3. MOTIVATION

    • Create a dataset/web application to be used:
        o As a benchmark dataset for deriving a quality model for npm packages
        o For research on user-perceived quality: stars/forks/downloads vs. measured quality
        o As a recommendation engine for package selection
    ▪ Sturgeon’s law: 90% of everything is crap
  4. NPM-MINER ARCHITECTURE

    • Continuous replication of the npm registry
    • Continuous analysis of npm packages
    • JSSA: JavaScript Static Analyzer
    • Mashup from:
        o Our analysis
        o npms.io
        o GitHub
    • https://github.com/AuthEceSoftEng/msr-2018-npm-miner
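
The continuous replication and analysis loop named on the architecture slide can be illustrated with a small sketch. It is not the npm-miner implementation: it assumes a CouchDB-style change feed in which every entry carries a sequence number (`seq`) and a package name (`id`), and the function name `enqueue_changes`, the `pending` queue and the sample feed are hypothetical.

```python
from collections import deque


def enqueue_changes(changes, last_seq, pending):
    """Queue package names changed after last_seq and return the new checkpoint.

    Assumption: each change is a dict with "seq" (int), "id" (package name)
    and an optional "deleted" flag, as in a CouchDB-style change feed.
    """
    queued = set(pending)
    for change in sorted(changes, key=lambda c: c["seq"]):
        if change["seq"] <= last_seq:
            continue  # already handled in an earlier pass
        last_seq = change["seq"]
        if change.get("deleted") or change["id"] in queued:
            continue  # nothing to analyze, or already waiting for analysis
        pending.append(change["id"])
        queued.add(change["id"])
    return last_seq


# Hypothetical feed: "left-pad" changes twice, "gone" was unpublished.
pending = deque()
feed = [
    {"seq": 1, "id": "left-pad"},
    {"seq": 2, "id": "express"},
    {"seq": 3, "id": "left-pad"},
    {"seq": 4, "id": "gone", "deleted": True},
]
checkpoint = enqueue_changes(feed, 0, pending)
```

Storing the returned checkpoint lets a later pass resume from where the previous one stopped, so a package is re-analyzed only when a new change for it arrives.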
  5. FINDINGS

    Typical problems encountered:
    1. The declared GitHub URL of an npm package leads to a "not found" page.
    2. The declared GitHub URL of an npm package redirects to a GitHub page other than the one declared (probably a maintenance issue in the package.json file).
    3. Many npm packages contain copy-pasted popular open source projects with only the package name changed in the package.json file.
    4. Many npm packages share the same GitHub repository.
    5. Sometimes the JSSA fails or does not complete due to large or unexpected input.

    All extremes can be found…

    Metric                  Min       Max     Mean    Std
    Cyclomatic complexity   1         288.5   1.67    0.62
    Maintainability index   -119.33   171     125     16
    ESLint issues           0         767K    181.63  2.7K
    nsp issues              0         162     0.52    2.7
    # of files              1         999     7.58    27
    # of lines              1         2.5M    1,577   15.4K
    # of dependencies       0         652     4.01    7.39
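
Problems 1, 2 and 4 above all come down to comparing declared repository URLs. The following is a minimal sketch of such checks, not the authors' code: the helper names (`normalize_github_url`, `classify_link`, `shared_repositories`) are hypothetical, and the HTTP status and final URL are assumed to have been obtained beforehand by requesting the declared URL.

```python
import re
from collections import defaultdict

# Matches both https://github.com/owner/repo and git@github.com:owner/repo forms.
_GITHUB_RE = re.compile(
    r"github\.com[:/]+(?P<owner>[^/\s]+)/(?P<repo>[^/\s#]+)", re.IGNORECASE
)


def normalize_github_url(declared):
    """Reduce a declared repository URL to lower-case 'owner/repo', or None."""
    if not declared:
        return None
    match = _GITHUB_RE.search(declared)
    if match is None:
        return None
    repo = match.group("repo")
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{match.group('owner')}/{repo}".lower()


def classify_link(declared, status_code, final_url):
    """Label a declared URL given the (assumed pre-fetched) HTTP outcome."""
    if status_code == 404:
        return "not_found"  # problem 1
    if normalize_github_url(final_url) != normalize_github_url(declared):
        return "redirected"  # problem 2
    return "ok"


def shared_repositories(packages):
    """Map each repository declared by more than one package to those packages."""
    by_repo = defaultdict(list)
    for name, declared in packages.items():
        repo = normalize_github_url(declared)
        if repo is not None:
            by_repo[repo].append(name)
    return {repo: sorted(names) for repo, names in by_repo.items() if len(names) > 1}  # problem 4
```

Normalizing to `owner/repo` before comparing keeps differences in scheme, letter case or a trailing `.git` from being reported as redirects or as distinct repositories.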
  6. THANK YOU

    URL: http://npm-miner.com
        • Web app
        • Paper
        • Dataset
    GitHub Repo: https://github.com/AuthEceSoftEng/msr-2018-npm-miner
        • For dataset recreation
    Contact: [email protected]