Measuring Reproducibility in Computer Systems Research

Reading Group 2014

Emir Muñoz

May 21, 2014

Transcript

  1. Measuring Reproducibility in Computer Systems Research
    Emir Muñoz, National University of Ireland Galway
    Christian Collberg, Todd Proebsting, Gina Moraila, Akash Shankaran, Zuoming Shi, Alex M Warren
    http://reproducibility.cs.arizona.edu/
  2. DEFINITION
    Reproducibility is the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently.
    One of the main principles of the scientific method.
  3. “Unwillingness or inability to share one’s work with fellow researchers hampers the progress of science and leads to needless replication of work and the publication of potentially flawed results.”
  4. EXPERIMENT
    • Cliché phrases?
      “Our approach can be applied on ...”
      “Our implementation can be found at ...”
      “... we implemented our approach”
      “code and data can be downloaded from our website”
    • 613 papers with practical orientation from:
      – 8 ACM Conferences: ASPLOS’12, CCS’12, OOPSLA’12, OSDI’12, PLDI’12, SIGMOD’12, SOSP’11, VLDB’12
      – 5 Journals: TACO’9, TISSEC’15, TOCS’30, TODS’37, TOPLAS’34
  5. Can a CS student build the software within 30 minutes, including finding and installing any dependent software and libraries, and without bothering the authors?
    Image source: http://jazzadvice.com/
  6. PREVIOUS EXERCISES
    • [Vandewalle et al. 2009] distinguish six degrees of reproducibility:
      – 5: The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring only standard, freely available tools (C compiler, etc.).
      – 4: The results can be easily reproduced by an independent researcher with at most 15 min of user effort, requiring some proprietary source packages (MATLAB, etc.).
      – 3: The results can be reproduced by an independent researcher, requiring considerable effort.
      – 2: The results could be reproduced by an independent researcher, requiring extreme effort.
      – 1: The results cannot seem to be reproduced by an independent researcher.
      – 0: The results cannot be reproduced by an independent researcher.
  7. PREVIOUS EXERCISES
    • [Stodden 2010] reports on 638 registrants at the NIPS machine learning conference.
      – Why don’t we share the code?
        “The time it takes to clean up and document for release”
        “Dealing with questions from users about the code”
        “The possibility that your code may be used without citation”
        “The possibility of patents, or other IP constraints”
        “Competitors may get an advantage”
  8. RESULTS
    [Bar chart: Reproducibility. Bar values: 9.8%, 17.4%, 26.0%, 34.4%, 44.4%; axis from 0% to 50%.]
  9. RESULTS
    • The National Science Foundation’s (NSF) Grant Policy Manual states that:
      – Investigators are expected to share with other researchers...
      – Investigators and grantees are encouraged to share software and inventions...
      – ... Responsibility that investigators and organizations have as members of the scientific and engineering community, to make results, data and collections available to other researchers.
    • Industry
      – Papers with only authors from industry have a low rate of reproducibility
  10. So, What Were Their Excuses?
    • Versioning Problems
    • Code Will be Available Soon
    • Programmer Left
    • Bad Backup Practices
    • Commercial Code
    • Proprietary Academic Code
    • Unavailable Subsystems
    • Multiple Reasons
    • Intellectual Property
    • Research vs. Sharing
    • Security and Privacy
    • Poor Design
    • Too Busy to Help
    Image source: www.funnyjunk.com
  11. RESULTS
    Attached is the (system) source code of our algorithm. I’m not very sure whether it is the final version of the code used in our paper, but it should be at least 99% close.
    Thank you for your interest in our work. Unfortunately the current system is not mature enough at the moment, so it’s not yet publicly available...
    I am afraid that the source code was never released. The code was never intended to be released so is not in any shape for general use.
    (STUDENT) was a graduate student in our program but he left a while back so I am responding instead...
    Thanks ... Unfortunately, the server in which my implementation was stored had a disk crash in April and three disks crashed simultaneously...
    The code is owned by (COMPANY), ...is not open-source...You best bet is to reimplement :( Sorry
    ...sources are not meant to be opensource..I do not have the liberty of making available The source code at my current institution (UNIVERSITY)...
  12. RESULTS
    Most importantly, I do not have the bandwidth to help anyone come up to speed on this stuff.
  14. RECOMMENDATIONS
    • Conferences should require the code along with every paper submitted
    • Build special tools that can run reliably and with reproducible results
    • Build web sites that allow authors to make their code available to colleagues
    • Do not follow bad habits such as the “publish and forget” style of scientific research
  15. LESSONS LEARNED
    1. Unless you have compelling reasons not to, plan to release the code.
    2. Students will leave, plan for it.
    3. Create permanent email addresses.
    4. Create project websites.
    5. Use a source code control system.
    6. Back up your code.
    7. Resolve licensing issues.
    8. Keep your promises.
    9. Plan for longevity.
    10. Avoid cool but unusual design.
    11. Plan for Reproducible Releases.
  17. Rule 1: For Every Result, Keep Track of How It Was Produced
    Ten Simple Rules for Reproducible Computational Research
    http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003285
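One way to read Rule 1 is to store, next to each result, a small record of how it was obtained. The sketch below is only an illustration under assumptions: the function names, the record fields, and the toy "mean" analysis are hypothetical and do not come from the talk or the cited article.

```python
import hashlib
import json
import sys
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(input_bytes: bytes, params: dict, result: float) -> dict:
    """Bundle a result with the details needed to trace how it was produced."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "command": sys.argv,  # how the script was invoked
        "input_sha256": sha256_of(input_bytes),  # identifies the exact input
        "params": params,
        "result": result,
    }


# Hypothetical analysis: the mean of a small in-memory data set.
raw = b"1.0\n2.0\n3.0\n"
values = [float(line) for line in raw.decode().splitlines()]
record = provenance_record(raw, {"statistic": "mean"}, sum(values) / len(values))
print(json.dumps(record, indent=2))
```

Writing such a record to a file beside the result is one option; a workflow tool or a lab notebook could serve the same purpose.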
  18. Rule 2: Avoid Manual Data Manipulation Steps
  19. Rule 3: Archive the Exact Versions of All External Programs Used
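A lightweight first step toward Rule 3 is to record which versions were in use when a result was produced. The sketch below assumes a Python-based analysis; the function name and the package names passed to it are hypothetical. Note that recording version numbers is weaker than archiving the programs themselves, which is what the rule asks for.

```python
import json
import platform
import sys
from importlib import metadata


def environment_snapshot(packages):
    """Collect interpreter, OS, and package versions for later reference."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


# Package names are placeholders; list the ones your analysis depends on.
snapshot = environment_snapshot(["pip", "surely-not-installed-example"])
print(json.dumps(snapshot, indent=2))
```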
  20. Rule 4: Version Control All Custom Scripts
  21. Rule 5: Record All Intermediate Results, When Possible In Standardized Formats
  22. Rule 6: For Analyses That Include Randomness, Note Underlying Random Seeds
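Rule 6 can be illustrated with a few lines of Python. This is a minimal sketch, assuming the analysis draws its randomness from a single generator; the seed value and the function name are arbitrary choices made for the example.

```python
import random

SEED = 20140521  # arbitrary value chosen for illustration


def run_analysis(seed: int, n: int = 5) -> list:
    """Draw n pseudo-random samples from an explicitly seeded generator."""
    rng = random.Random(seed)  # local generator, independent of global state
    return [rng.random() for _ in range(n)]


first = run_analysis(SEED)
second = run_analysis(SEED)
# Report the seed together with the output so the run can be repeated.
print({"seed": SEED, "samples": first})
print("identical reruns:", first == second)
```

Passing the seed as a parameter, rather than seeding the global generator, keeps the noted seed and the code that consumes it in one place.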
  23. Rule 7: Always Store Raw Data behind Plots
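For Rule 7, one simple approach is to write the plotted values to a file whenever a figure is generated. The sketch below uses only the standard library and leaves out the plotting itself; the file name, the helper functions, and the values are placeholders, not data from the study.

```python
import csv
import tempfile
from pathlib import Path


def save_plot_data(path: Path, header, rows) -> None:
    """Write the exact values behind a figure to a CSV file."""
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)


def load_plot_data(path: Path):
    """Read the values back, e.g. to redraw or restyle the figure later."""
    with path.open(newline="") as handle:
        return list(csv.reader(handle))


# Placeholder values, not taken from the study.
rows = [("A", 1.5), ("B", 2.5)]
out_dir = Path(tempfile.mkdtemp())
data_file = out_dir / "figure1_data.csv"  # hypothetical file name
save_plot_data(data_file, ("label", "value"), rows)
restored = load_plot_data(data_file)
print(restored)
```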
  24. Rule 8: Generate Hierarchical Analysis Output, Allowing Layers of Increasing Detail to Be Inspected
  25. Rule 9: Connect Textual Statements to Underlying Results
  26. Rule 10: Provide Public Access to Scripts, Runs, and Results
  28. CONCLUSION
    • As a discipline, we are a long way from producing research that is always, and completely, reproducible.
    • Sharing may increase the probability of citation.
    • Sharing specifications will have a positive effect on researchers’ willingness to share.
    • Sharing specifications can be used as a contract between authors and readers.
  29. HOW THIS IS RELATED TO MY PHD
    • Data Quality and Trustworthiness
      – How close is this data to the real world?
      – Can I trust this data?
    Data is The New (Black) Gold
  30. FURTHER LITERATURE
    • Data Replication & Reproducibility
      – http://www.sciencemag.org/site/special/data-rep/
    • Getting Results from Testing by Laura Dillon (ACM Distinguished Speakers Program)
      – http://dsp.acm.org/view_lecture.cfm?lecture_id=108
    • Why You Should Share Your Musical Knowledge
      – http://jazzadvice.com/why-you-should-share-your-musical-knowledge/
    • Reproducible Research in Signal Processing
      – http://rr.epfl.ch/17/1/VandewalleKV09.pdf
  31. FURTHER LITERATURE
    • RunMyCode enables scientists to openly share the code and data that underlie their research publications
      – http://www.runmycode.org/
    • Executable Papers
      – http://executablepapers.com/
    • CDE: Automatically create portable Linux applications (i.e., package, deliver, run).
      – http://www.pgbovine.net/cde.html
  32. FURTHER LITERATURE
    • VLDB Guidelines
      – http://www.vldb.org/2013/experimental_reproducibility.html
    • Data Package Management
      – http://dat-data.com/
      – https://github.com/maxogden/dat
    • Data Dryad
      – http://datadryad.org/