npm-miner: An Infrastructure for Measuring the Quality of the npm Registry

Presented as a data showcase in Mining Software Repositories (MSR) 2018

Transcript

  1. NPM-MINER
    AN INFRASTRUCTURE FOR MEASURING THE
    QUALITY OF THE NPM REGISTRY
    Kyriakos C. Chatzidimitriou, Michail Papamichail, Themistoklis Diamantopoulos, Michail Tsapanos, and Andreas L. Symeonidis
    Electrical and Computer Engineering Department
    Aristotle University of Thessaloniki, Greece

  2. NPM-REGISTRY & JS
    • Atwood’s law: ‘Any application that can
    be written in JavaScript, will eventually be
    written in JavaScript’
    • Front-end: Angular, React, Vue
    • Back-end: Node.js
    • Desktop: Electron
    • Mobile: React-Native, NativeScript
    • IoT: Node-RED
    • ML: https://js.tensorflow.org/
    • Believed to be one of the 3 components
    of the JS revolution (along with project
    templates and web components)
    29 May 2018 MSR 2018, GOTHENBURG 2
    [Figures: "551/day"; RedMonk, Jan 2018: # of GitHub projects vs. # of SO tags; Modulecounts.com: # of packages since 2010]

  3. MOTIVATION
    • Create a dataset/web application to be used:
    o As a benchmark dataset for deriving a quality model for npm packages
    o For research on user-perceived quality: stars/forks/downloads vs. measured quality
    o As a recommendation engine for package selection
    • Sturgeon’s law: 90% of everything is crap

  4. NPM-MINER ARCHITECTURE
    Continuous replication of the npm registry
    Continuous analysis of npm packages
    JSSA: JavaScript Static Analyzer
    https://github.com/AuthEceSoftEng/msr-2018-npm-miner
    Mashup from:
    • Our analysis
    • npms.io
    • GitHub
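
The slide does not say how the continuous replication is implemented. As a minimal sketch, assuming the registry is followed through a CouchDB-style changes feed (batches shaped like `{"results": [...], "last_seq": ...}`), the following shows how one batch could be folded into a queue of packages awaiting analysis. The names `Mirror` and `apply_changes`, and the sample batch, are hypothetical and not taken from npm-miner.

```python
from dataclasses import dataclass, field


@dataclass
class Mirror:
    """Local view of the registry: pending packages plus a resume checkpoint."""
    pending: set = field(default_factory=set)  # packages awaiting analysis
    last_seq: object = 0                       # where to resume the feed


def apply_changes(mirror: Mirror, batch: dict) -> Mirror:
    """Fold one CouchDB-style changes batch into the mirror (assumed format)."""
    for row in batch.get("results", []):
        name = row["id"]
        if name.startswith("_design/"):
            continue  # design documents are not packages
        if row.get("deleted"):
            mirror.pending.discard(name)  # unpublished: nothing to analyze
        else:
            mirror.pending.add(name)      # new or updated: (re)analyze
    mirror.last_seq = batch.get("last_seq", mirror.last_seq)
    return mirror


# Hypothetical batch, for illustration only.
batch = {
    "results": [
        {"seq": 1, "id": "left-pad", "changes": [{"rev": "1-a"}]},
        {"seq": 2, "id": "_design/app", "changes": [{"rev": "1-b"}]},
        {"seq": 3, "id": "old-pkg", "changes": [{"rev": "2-c"}], "deleted": True},
    ],
    "last_seq": 3,
}
mirror = apply_changes(Mirror(pending={"old-pkg"}), batch)
print(sorted(mirror.pending), mirror.last_seq)
```

Storing `last_seq` lets a restarted worker resume from its checkpoint instead of re-reading the whole feed, which is what makes the replication continuous rather than a one-off snapshot.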

  5. FINDINGS
    Typical problems encountered:
    1. The declared GitHub URL of an npm package
    leads to a "not found" page.
    2. The declared GitHub URL of an npm package
    redirects to a different GitHub page than the
    one declared (probably a maintenance issue of
    the package.json file).
    3. Many npm packages contain copy-pasted
    popular open source projects with only the
    package name changed in the package.json file.
    4. Many npm packages share the same GitHub
    repository.
    5. Sometimes the JSSA fails or does not complete
    due to large or unexpected input.
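
Problems 1, 2 and 4 above can be flagged mechanically once repository URLs are put into a common form. The following is a minimal sketch under stated assumptions, not the authors' method: `normalize_github_url`, `classify_repo_check` and `shared_repositories` are hypothetical helpers, the package names are made up, and the HTTP status and final URL are assumed to come from a request the caller has already made.

```python
import re
from collections import defaultdict
from typing import Optional


def normalize_github_url(url: str) -> Optional[str]:
    """Reduce common package.json repository spellings to 'owner/repo'."""
    match = re.search(r"github\.com[:/]+([^/\s]+)/([^/\s#]+)", url, re.IGNORECASE)
    if not match:
        return None
    owner, repo = match.group(1), match.group(2)
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{owner}/{repo}".lower()


def classify_repo_check(declared: str, status: int, final_url: str) -> str:
    """Label the outcome of an HTTP request for the declared URL."""
    if status == 404:
        return "not_found"    # problem 1
    before = normalize_github_url(declared)
    after = normalize_github_url(final_url)
    if before is None or after is None:
        return "unknown"
    return "ok" if before == after else "redirected"  # problem 2


def shared_repositories(packages: dict) -> dict:
    """Map each repository declared by more than one package to those packages."""
    by_repo = defaultdict(list)
    for name, url in packages.items():
        repo = normalize_github_url(url)
        if repo is not None:
            by_repo[repo].append(name)
    return {repo: sorted(names) for repo, names in by_repo.items() if len(names) > 1}


# Made-up packages, for illustration only.
packages = {
    "pkg-a": "git+https://github.com/Foo/Bar.git",
    "pkg-b": "git@github.com:foo/bar.git",
    "pkg-c": "https://github.com/baz/qux",
}
shared = shared_repositories(packages)  # problem 4
status_label = classify_repo_check(
    "https://github.com/foo/bar", 200, "https://github.com/newowner/bar"
)
print(shared, status_label)
```

Normalizing before comparing matters because the same repository is often declared with different spellings (`git+https`, SSH, trailing `.git`), which would otherwise hide shared repositories and produce false redirects.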
    All extremes can be found…
    Metric                   Min      Max     Mean    Std
    Cyclomatic complexity    1        288.5   1.67    0.62
    Maintainability index    -119.33  171     125     16
    ESLint issues            0        767K    181.63  2.7K
    nsp issues               0        162     0.52    2.7
    # of files               1        999     7.58    27
    # of lines               1        2.5M    1,577   15.4K
    # of dependencies        0        652     4.01    7.39
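
Each row of the table is a four-number summary of one metric over all analyzed packages. As a small sketch of how such a row could be computed, the helper below returns (min, max, mean, std). `summarize` is a hypothetical name, the input values are made up, and the population standard deviation is an assumption, since the slides do not say which variant was used.

```python
import statistics


def summarize(values):
    """Return (min, max, mean, std) for a non-empty list of numbers."""
    return (
        min(values),
        max(values),
        statistics.fmean(values),
        statistics.pstdev(values),  # population std; the sample std is the alternative
    )


# Made-up dependency counts with one outlier, for illustration only.
deps = [0, 2, 2, 4, 12]
lo, hi, mean, std = summarize(deps)
print(lo, hi, round(mean, 2), round(std, 2))
```

Even in this toy input a single outlier pushes the standard deviation above the mean, the same pattern the table shows for metrics such as the number of dependencies or ESLint issues.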

  6. THANK YOU
    URL: http://npm-miner.com
    • Web app
    • Paper
    • Dataset
    GitHub Repo: https://github.com/AuthEceSoftEng/msr-2018-npm-miner
    • For dataset recreation
    Contact: [email protected]