How to write a reproducible paper

Damien Irving
November 26, 2015

Seminar at Monash University

Transcript

  1. How to write a reproducible paper
    Damien Irving
    University of Melbourne

  2. Irving D, Simmonds I (2015). A novel approach to diagnosing Southern
    Hemisphere planetary wave activity and its influence on regional climate
    variability. Journal of Climate, 28, 9041-9057.
    doi:10.1175/JCLI-D-15-0287.1

    Irving D (in press). A minimum standard for publishing computational
    results in the weather and climate sciences. Bulletin of the American
    Meteorological Society. doi:10.1175/BAMS-D-15-00010.1

  3. The reproducibility crisis
    • Our field has rapidly transitioned to a computational science
    • Conventions around communicating our methods have hardly changed
      - Have you ever seen a paper provide (ancillary) code/software details?
    • It’s impossible to replicate the results presented in journal papers today

  4. The crisis response
    • Funding agencies + journals [1]
      - Some progress on dataset disclosure
        ○ Funders like NSF, ARC have policies
        ○ Most weather/climate journals have policies
        ○ Not consistently enforced
      - Weak or non-existent code requirements
    • It’s not all their fault
      - No examples to base new standards on
      - I set about addressing this deficiency…
    [1] Stodden et al. 2013. PLoS ONE, 8, e67111

  5. A plan for change
    1. Consult the literature
      - Why don’t people publish their code?
      - Best practices for scientific computing
    2. Devise and implement an approach
      - Irving & Simmonds (2015)
    3. Lobby journals
      - Propose a communication standard (BAMS)
      - Contact decision makers
    4. Help scientists improve their skills
      - Software Carpentry

  6. 1. The literature
    • Barriers to overcome [1]
      - Perceived lack of time
      - Low computational competency
      → minimise time and complexity
    • Computational best practice [2]
      - Write scripts
      - Modularise, don’t copy/paste → code library
      - Use version control
    [1] Stodden (2010). doi:10.2139/ssrn.1550193
    [2] Wilson et al. 2014. PLoS Biol, 12, e1001745

  7. 2. The approach
    • Add a computation section that contains:
      - Brief overview of software packages
        ○ Academic credit for software authors
      - Link to collection of supplementary materials:
        ○ Software description, code, log files
        ○ Host with the journal, your institution, or a service such as Figshare or Zenodo
    http://dx.doi.org/10.6084/m9.figshare.1385387

  8. Software Description
    • Name, version number, release date, institution and DOI or URL
      - i.e. sufficient detail to recreate the environment
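
As an illustration of the software description, here is a minimal Python sketch that collects the name and version number of each package in use. It relies only on the standard library (`importlib.metadata`, Python 3.8+). The function name `describe_software` and the package list are hypothetical choices for this example, not part of the talk. Release date, institution and DOI or URL are not discoverable this way and would still need to be added by hand.

```python
import platform
import sys
from importlib import metadata


def describe_software(packages):
    """Return one line for the Python interpreter, then one line per package."""
    lines = [f"Python {platform.python_version()} ({sys.platform})"]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages rather than failing part-way through.
            version = "not installed"
        lines.append(f"{name} {version}")
    return lines


if __name__ == "__main__":
    # Hypothetical package list: replace with the packages your analysis uses.
    print("\n".join(describe_software(["numpy", "xarray"])))
```

Saving this output next to the code is one way to keep the description in step with the environment that actually produced the results.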

  9. Code
    • [desirable] Link to version controlled repository at an external hosting service
      - Allows for revision history, pull requests
      - Your everyday repository is fine
        ○ github.com/DamienIrving/climate-analysis
    • [compulsory] Latest version of code
      - With software description and log files

  10. Log files
    • Step-by-step account, from data download to final result
    • My suggestion: the NCO / CDO approach
      - Can generate timestamps with any language
      - Features: simple, readable and writeable by anyone, easy to regenerate (no manual editing)
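
A log in this style can be sketched in a few lines of Python. The idea, as in the history records that NCO and CDO keep, is that every processing step adds a time-stamped copy of the command that was run. The function names, the newest-first ordering and the optional Git hash below are assumptions made for this sketch, not a definitive implementation of either tool's behaviour.

```python
import sys
from datetime import datetime


def new_log_entry(argv=None, git_hash=None, now=None):
    """Build one log line: a timestamp, then the command as it was run."""
    argv = sys.argv if argv is None else argv
    now = datetime.now() if now is None else now
    stamp = now.strftime("%a %b %d %H:%M:%S %Y")
    entry = f"{stamp}: {sys.executable} {' '.join(argv)}"
    if git_hash:
        # Assumption: the caller supplies the commit hash of the code used.
        entry += f" (Git hash: {git_hash})"
    return entry


def update_log(previous_log, entry):
    """Put the newest entry first and keep the input file's history below it."""
    return entry if not previous_log else entry + "\n" + previous_log
```

A script would call `new_log_entry()` once, combine it with the log of its input file via `update_log`, and write the result alongside its output. No manual editing is involved, so the log can be regenerated simply by rerunning the steps.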

  11. 3. Lobby decision makers
    • Proposed minimum standard:
      - Authors must include brief computation section which cites software and points to supplementary materials:
        ○ Software description
        ○ Code (suggest public, version controlled)
        ○ Log files
      - Authors not obliged to provide assistance
      - Reviewers only need to check availability
      - Editorial discretion re code privacy

  12. Next steps
    - AMS Board on Data Stewardship
    - Will you volunteer to try the approach for your next paper?
    https://drclimate.wordpress.com/2015/11/05/a-call-for-reproducible-research-volunteers/

  13. 4. Helping scientists
    • Software Carpentry
      - AMOS Conference 2013-15
      - Upcoming training: http://go.unimelb.edu.au/7cra
    • Content (two days):
      - Unix Shell
      - Programming (in Python)
      - Version control
      - Workflow automation (with Make)
      - Data management: damienirving.github.io/capstone-oceanography/

  14. 1-3 February, 2016
    resbaz.com

  15. Aim higher!
    • Minimum standard is reproducible, but not very comprehensible
    • Ideas:
      - README files
      - Write packages
        ○ e.g. eofs, windspharm, SkewT
      - VisTrails / CWSLab workflow tool
        ○ http://cwslab.nci.org.au/
      - Docker / RunMyCode.org
      - Write the new workflow management tool?

  16. https://github.com/CWSL/cwsl-mas/wiki/Tutorial

  17. Why aim for reproducibility?
    • To literally reproduce results / catch people out
      - Software evolves so quickly
      - Most don’t have access to suitable hardware
    • To build on each other’s ideas faster
      - The risk of people doing nothing with your work is much greater than the risk of being “scooped”

  18. Summary
    • There is a reproducibility crisis in weather/climate/ocean research
    • This can be solved by adding a brief computation section to papers which points to supplementary materials:
      - Software description
      - Code repository (public, version controlled)
      - Log files
    • Journals could adopt this framework as a formal minimum standard

  19. Questions?
    https://drclimate.wordpress.com/orientation-guide/
