

In Dependencies We Trust: How vulnerable are dependencies in npm modules?

Talk @ Software Languages Lab, Department of Computer Science of the Vrije Universiteit Brussel (VUB)


Joseph Hejderup

May 27, 2016


Transcript

  1. ABOUT ME? Ehh... Born in this “peaceful” country.. strangely moved

    to.. ..in 2012 to pursue my M.Sc and.... ..still here... I enjoy... and... Scientific Programmer @
  2. SOFTWARE RE-USE No need to reinvent the wheel! <code>UI</code> <code>

    API </code> <code> Database </code> <code> Utilities </code> <code> statistics </code> <code> API </code> <code dep=A>UI</code> <code dep=B> Database </code> <code dep=C> Utilities </code> <code dep=D> statistics </code>
  3. SOFTWARE RE-USE

    An important pillar for modern software development! 90% use third-party open-source code [1].
    [1]: 2014 Sonatype Open Source Development and Application Security Survey
  4. SOFTWARE RE-USE

    • 63% have no active monitoring of vulnerabilities!
    • 70% have never banned the usage of a component!
    • Only 21% have policies to show secure 3rd party source code
    Source: 2014 Sonatype Open Source Development and Application Security Survey
  5. If security policies are neglected, how do centralized

    software repositories deal with security?
  6. 278,396 JavaScript modules, growing by 482 modules/day

    Largest centralized software repository! In comparison, Maven has 143,714 modules with a growth of 132 modules/day.
    Source: modulecounts.com
  7. Liberal screening process!

    • Any user can upload a module: no verification of the uploader!
    • Supplied content is not reviewed: no code quality check or monitoring
  8. JavaScript: Most misunderstood language?

    // Deliberately unsafe example: text taken from the URL ends up inside eval().
    var pos = document.URL.indexOf("user=") + 5;
    var userInput = document.URL.substring(pos, document.URL.length);
    eval("document.write('" + decodeURI(userInput) + "');");
  9. WHERE CAN IT GO WRONG?

    [Diagram labels: CLIENT, SERVER, INPUT, INPUT VALIDATION, OUTPUT SANITIZATION, PRESENTATION; DATA: anything, clean, extra clean]
    Using vulnerable components in critical parts!
  10. RESEARCH QUESTIONS

    RQ1: What is the prevalence of modules that use at least one dependency that is disclosed as vulnerable?
    RQ2: What is the cascading effect of modules depending on at least one vulnerable module?
    RQ3: What is the time latency for updating to a non-vulnerable version range for a dependency?
    Answering them both qualitatively and quantitatively!
  11. SEMANTIC VERSIONING

    npm deploys semantic versioning: MAJOR.MINOR.PATCH
    • Caret-Range: ^1.2.3 (>= 1.2.3 < 2.0.0)
    • Tilde-Range: ~1.2.3 (>= 1.2.3 < 1.3.0)
    • X-Range: 1.2.x
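To make the range syntax concrete, here is a minimal sketch that expands caret and tilde ranges into lower and upper bounds, following the rules documented for node-semver (caret keeps the left-most non-zero component fixed, tilde allows patch-level changes only). The function names `parseVersion` and `expandRange` are made up for this illustration, and only full MAJOR.MINOR.PATCH inputs without pre-release tags are handled.

```javascript
// Hypothetical helpers, not part of any library: expand "^x.y.z" / "~x.y.z"
// into a half-open interval [lower, upper).
function parseVersion(v) {
  return v.split(".").map(Number);
}

function expandRange(range) {
  const op = range[0];
  const [major, minor, patch] = parseVersion(range.slice(1));
  const lower = `${major}.${minor}.${patch}`;
  if (op === "~") {
    // Tilde: only patch-level changes are allowed.
    return { lower, upper: `${major}.${minor + 1}.0` };
  }
  if (op === "^") {
    // Caret: the left-most non-zero component must not change.
    if (major > 0) return { lower, upper: `${major + 1}.0.0` };
    if (minor > 0) return { lower, upper: `0.${minor + 1}.0` };
    return { lower, upper: `0.0.${patch + 1}` };
  }
  throw new Error(`unsupported range: ${range}`);
}

console.log(expandRange("^1.2.3")); // { lower: '1.2.3', upper: '2.0.0' }
console.log(expandRange("~1.2.3")); // { lower: '1.2.3', upper: '1.3.0' }
console.log(expandRange("~0.0.2")); // { lower: '0.0.2', upper: '0.1.0' }
```

In practice the `semver` package does this work; the sketch only spells out the arithmetic behind the two operators.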
  12. HOW DO YOU DETECT VULNERABLE DEP’S?

    PRE-STEP: IDENTIFY VULN
    • All versions for bassmaster
    • Advisory range “<=1.5.1”: these are the vuln versions for bassmaster
    • Declared dependency range “~0.0.2” -> “>=0.0.2 <0.1.0”: vuln versions for this module
    Will this range resolve to a vuln version?

    var maxSatis = semver.maxSatisfying(all_versions_list, "~0.0.2"); // 0.0.2

    Yes!
  13. TWO RANGES

    • Vulnerable range: the advanced syntax range (e.g. caret/tilde range) will resolve to a vulnerable version!
    • Mixed range: contains both vulnerable and non-vulnerable versions, but will most likely resolve to a non-vuln version!
    Food for thought: How should we view this?
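The distinction between the two kinds of range can be sketched as a small classifier: collect every release that satisfies the declared range, take the highest one as the resolved version, and check it against the advisory. This is an illustration under stated assumptions, not the tooling used in the study: `compare`, `inRange` and `classify` are hypothetical helpers, the release list is invented (it is not the real bassmaster history), and only the advisory range `<=1.5.1` comes from the slides.

```javascript
// Hypothetical helpers for plain MAJOR.MINOR.PATCH versions (no pre-release tags).
const parse = (v) => v.split(".").map(Number);

// Sort-style comparator: negative, zero or positive.
function compare(a, b) {
  const pa = parse(a), pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// Half-open interval check: lower <= v < upper.
const inRange = (v, { lower, upper }) =>
  compare(v, lower) >= 0 && compare(v, upper) < 0;

function classify(allVersions, declared, isVulnerable) {
  const candidates = allVersions.filter((v) => inRange(v, declared)).sort(compare);
  if (candidates.length === 0) return "unresolved";
  const resolved = candidates[candidates.length - 1]; // highest satisfying version
  if (isVulnerable(resolved)) return "vulnerable range";
  return candidates.some((v) => isVulnerable(v)) ? "mixed range" : "no vulnerable version";
}

// Invented release list, for illustration only.
const releases = ["0.0.1", "0.0.2", "1.5.0", "1.5.1", "1.5.2"];
// Advisory range from the slides: <=1.5.1
const advisory = (v) => compare(v, "1.5.1") <= 0;

console.log(classify(releases, { lower: "0.0.2", upper: "0.1.0" }, advisory)); // vulnerable range
console.log(classify(releases, { lower: "1.5.0", upper: "2.0.0" }, advisory)); // mixed range
console.log(classify(releases, { lower: "1.5.2", upper: "2.0.0" }, advisory)); // no vulnerable version
```

A mixed range only stays safe as long as installs actually pick the newest release, which is why it is worth tracking separately.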
  14. STUDY OF 22 NSP Advisories I studied 22 advisories, one

    year later there are 89 advisories!
  15. REGISTRY

    Snapshot taken on 12 October 2014
    • 104,675 entries in the db
    • 99,631 processed
    • 61,508 use at least one dependency
    • Graph database
  16. PRESENCE: npm

    • 3.2% of the modules use the advisory modules as dependencies
    • 1.7% of the modules have a vulnerable version range
    • 1.6% resolve to a vulnerable version
    LOW PRESENCE!
  17. PRESENCE: ADVISORIES

    1/3 of the advisory modules are vulnerable!
    [Chart values: 1517, 1678, 1029; legend: no vulnerable version, resolve vulnerability, vulnerable version range]
  18. TIME TO UPDATE

    [Timeline: Early 2013, Mid 2013, Early 2014, Sep 2014, 12 Oct 2014]
    Reduction of vulnerable modules from the day of publication: how long did it take?
  19. REDUCTION OF VULNERABILITIES

    Reduction from publication timestamp to 12 Oct 2014

    module       | publication date   | vulnerable | reduction
    js-yaml      | June 23, 2013      | 54         | -16.67%
    connect      | July 1, 2013       | 299        | -11.71%
    validator-v1 | July 5, 2013       | 48         | -8.33%
    marked       | January 31, 2014   | 279        | -14.34%
    st           | February 6, 2014   | 3          | 0.00%
    qs           | August 6, 2014     | 187        | -20.32%
    send         | September 12, 2014 | 99         | -11.11%
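The percentages can be related back to module counts with a little arithmetic. The sketch below assumes, as an interpretation rather than something the slides state, that "reduction" is measured relative to the number of vulnerable modules at publication; `impliedRemaining` and `reductionPercent` are hypothetical helper names.

```javascript
// Assumption: reduction % = (remaining - atPublication) / atPublication * 100.
function reductionPercent(atPublication, remaining) {
  return ((remaining - atPublication) / atPublication) * 100;
}

// Back out the implied number of still-vulnerable modules, rounded to whole modules.
function impliedRemaining(atPublication, percent) {
  return Math.round(atPublication * (1 + percent / 100));
}

// js-yaml row from the table: 54 vulnerable modules, -16.67%.
console.log(impliedRemaining(54, -16.67));        // 45
console.log(reductionPercent(54, 45).toFixed(2)); // -16.67
```

Under that assumption, the js-yaml row corresponds to 9 of 54 dependent modules moving off the vulnerable range by 12 Oct 2014.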
  20. HOW MUCH TIME IS NEEDED?

    Middle 50%, from 12 Oct 2014
    • Advisories released roughly 14 months ago:
      ◦ connect: 4 to 9 months
      ◦ js-yaml: 4.4 to 11 months
      ◦ validator-v1: 2 to 4.6 months
    • Advisories released roughly 8 months ago:
      ◦ hapi-v2: 57 to 84 days
      ◦ marked: 1.28 to 6.5 months
      ◦ st: 22.5 to 84 days
    • Advisories released roughly 3 months ago:
      ◦ hapi-v6: 13 to 44 days
      ◦ qs: 5.5 to 14 days
      ◦ send: 5 to 16 days
  21. SELECTION OF MODULES

    85 projects:
    • 4 top depend-on
    • 31 GitHub stars
    • 18 downloads
    • 1 GitHub forks
    • 31 random
  22. CODE INSPECTION

    How effective is analyzing dependency declarations? We need to see that it is actually in the code!
    Performed on five advisory modules:
    • Shallow code review: can we trigger the vulnerability?
    • Cascading dependencies: is the vulnerability propagated?
  23. EXAMPLE: connect

    The advisory reports on a middleware that was not used in any of the inspected modules. Is it wise for dependency checkers to report the full module as vulnerable?
  24. CASCADING DEP

    EXAMPLE: spotify-web-api
    • restler is used for HTTP requests
    • restler depends on qs for all requests
  25. WHY ARE THE DEPENDENCIES NOT UPDATED?

    38 SECURITY DISCUSSIONS
    • 21 of 85 modules have security discussions
    • 15 of 21 are related to NSP advisories
  26. Left-pad incident:

    • Immutable package management systems
    • Padding function as a dep?
    • We need to define what we can trust: contract between dev and repository!
    We are coming close to dependency smells!
  27. Still the same problems:

    • Better tooling: we need to be more creative to find novel solutions
      ◦ Static and dynamic analysis are part of it, but we need to look beyond them to other means!
      ◦ Can data-driven approaches be leveraged to make better models for program understanding?
    • See Ali Mesbah’s SANER’16 Talk: Software Analysis for the Web
  28. TAKE AWAY

    TL;DR: We need to start investigating our dependency hygiene more!
    This is just an “initial study”:
    - We need to investigate more challenges and problems with high dependency use
    - Development practices: how is risk calculated when using dependencies? E.g. we have seen cases where there are security holes that are costly
    - How do we develop architectures that allow for fast patching, or rank libraries for that matter?
    - This explores open-source projects; how does it look in industry projects?