Selfish reasons to carry out reproducible research

Dave Lunt
January 15, 2021

Given on 29 November 2019 at the University of Hull

Inspired by Markowetz F. Five selfish reasons to work reproducibly. Genome Biol. 2015;16: 274. doi:10.1186/s13059-015-0850-7

These Google slides are at https://bit.ly/35yDvIG
The slides created by me (almost all) can be considered CC0 public domain; use as you will. Some parts were created by others; I have put credit and copyright info in the speaker notes to the Google slides.

Transcript

  1. Selfish reasons to carry out
    reproducible research
    Dave Lunt
    [email protected]
    @davelunt
    https://bit.ly/35yDvIG

  2. What is
    reproducibility?

  3. first Why, then How

  4. It's required
    Why be reproducible?

  5. McNutt M. Journals unite for reproducibility. Science. 2014;346: 679. doi:10.1126/science.aaa1724

  8. RCUK – Statement of Expectations for
    Postgraduate Training
    Students should receive training in experimental design and
    statistics appropriate to their disciplines, and in the
    importance of ensuring research results are robust and
    reproducible

  11. It's the right thing
    to do
    It's science

  12. Ask not what you can
    do for reproducibility,
    but what
    reproducibility can
    do for you
    Florian Markowetz

  13. It will save you time and effort
    It will advance your career
    Selfish Reproducible Research

  14. Who here has
    tried to reproduce
    a published
    analysis?
    Who is most likely
    to reproduce your
    work?

  15. Do experiments work first
    time for you?

  16. “Future You” will be the most likely
    person to reproduce your work

  17. Future You
    Previous You
    Previous You does not
    respond to emails

  18. It will
    greatly help
    “future you”
    Selfish reasons to
    carry out reproducible
    research

  19. How can we save time and effort?
    e.g. make figures from scripts
    This is reproducible analysis

  20. Your research will be
    faster and easier (and
    better)

  21. The old way

  24. [Chart: cumulative total effort plotted against number of repeats,
    comparing Manual with Automated reproducible]
    Yes you will cross
    this point
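
The crossing point on this slide can be estimated with simple arithmetic. The sketch below assumes a linear cost model that is not stated in the talk: the manual route costs a fixed number of hours per repeat, while the scripted route costs a one-off setup plus a small number of hours per repeat. The function name and all the numbers are illustrative assumptions.

```python
import math


def break_even_repeats(setup_hours, manual_hours, automated_hours):
    """Smallest number of repeats at which the scripted route is cheaper in total.

    Assumed model (not from the talk):
      manual total    = n * manual_hours
      automated total = setup_hours + n * automated_hours
    """
    if manual_hours <= automated_hours:
        raise ValueError("automation never pays off if each run is not cheaper")
    # setup + n * automated < n * manual  <=>  n > setup / (manual - automated)
    return math.floor(setup_hours / (manual_hours - automated_hours)) + 1
```

With an assumed 8 hours to write the script, 2 hours per manual repeat and 0.1 hours per scripted repeat, `break_even_repeats(8, 2, 0.1)` gives 5: by the fifth repeat the manual route has cost 10 hours and the scripted route 8.5.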

  25. Reproducibility makes it easier
    to write papers
    and respond to
    reviewers

  26. Reproducible research will save you time and effort
    Reuse and recycle data generation and analysis

  27. Errors are ubiquitous
    Retractions will hurt you
    Reproducibility helps your career

  28. Reproducibility will help
    your career
    Reputation
    Rigour
    New collaborators
    Rapid
    Agile
    Future-proof

  29. Choose a collaborator
    Rigorous, modern, open, with
    future-proof methods.
    Leading the way. Prepared
    and shared many of the
    methods you need already.

  30. Projects are not unique.
    How will you build your career?

  31. Required
    Helps “Future You”
    Easier & faster, agile
    Easier papers
    Helps your next project
    Builds your career
    Avoid major screw-ups
    Makes you a cool collaborator
    Selfish reasons to
    be reproducible

  32. Pause
    But what about ...?

  33. I’d rather do real science than tidy my data
    It's the way I’ve always done things, and I’ve got this far
    Excel is just fine
    My data and code are spread across
    many computers, I couldn’t do this
    I’ll sort this out at the end
    My field is too competitive, I can’t slow
    down to do this

  34. I’m not a
    computational
    biologist

  35. How?

  36. 1. relax, most problems are solved

  37. 2. think of it as training

  38. 3. celebrate the quick wins

  39. Quick win:
    Be part of a support
    community

  40. Make 1
    figure from
    a script
    Quick win
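
As a minimal sketch of this quick win, a script along the following lines regenerates a figure from its data file every time it is run. It is an illustration, not the speaker's own script: the function names, the two-column CSV layout and the SVG output are assumptions, and it uses only the Python standard library so that it runs anywhere. In practice a plotting library would usually do the drawing.

```python
"""fig1.py: regenerate figure 1 from its data file (hypothetical example)."""
import csv
from pathlib import Path


def read_points(csv_path):
    """Read (x, y) pairs from a two-column CSV file that has a header row."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the header row
        return [(float(row[0]), float(row[1])) for row in reader if row]


def scatter_svg(points, width=400, height=300, margin=30):
    """Return a minimal SVG scatter plot of the points as a string."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_span = (max(xs) - min(xs)) or 1.0  # avoid dividing by zero
    y_span = (max(ys) - min(ys)) or 1.0
    circles = []
    for x, y in points:
        cx = margin + (x - min(xs)) / x_span * (width - 2 * margin)
        # SVG y coordinates grow downwards, so flip the axis
        cy = height - margin - (y - min(ys)) / y_span * (height - 2 * margin)
        circles.append(
            f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="4" fill="steelblue"/>'
        )
    body = "\n".join(circles)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">\n'
        f'<rect width="100%" height="100%" fill="white"/>\n{body}\n</svg>\n'
    )


def make_figure(data_csv, out_svg):
    """Read the data, draw the plot, and write it to out_svg."""
    out_path = Path(out_svg)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(scatter_svg(read_points(data_csv)))
    return out_path
```

Calling `make_figure` with a data path and an output path (for example a CSV under `DATA/fig1_data` and `FIGURES/fig1.svg`, both hypothetical names) rebuilds the figure. When the data change or a reviewer asks for a tweak, rerunning the script is the whole job.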

  41. Butterfly_project
    - DATA
      - raw_data
      - fig1_data
    - FIGURES
      - fig1.pdf
      - fig2.pdf
      - table1.md
    - RESULTS
      - PCA
      - lin_regr
    - SCRIPTS
      - fig1.py
    - README.txt
    Informative names
    Structured
    Text description of what is where
    Spend 1 morning
    to organise your
    data
    Quick win
    => Provenance and persistence
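
The layout on this slide can be created in one go with a short script. This is a sketch only: the folder names come from the slide, while the function name `create_project` and the README wording are assumptions made for illustration.

```python
from pathlib import Path

# Folder names taken from the slide.
LAYOUT = [
    "DATA/raw_data",
    "DATA/fig1_data",
    "FIGURES",
    "RESULTS/PCA",
    "RESULTS/lin_regr",
    "SCRIPTS",
]

# The README wording is an assumption; write whatever describes your project.
README_TEXT = """\
Butterfly_project
DATA     raw data and the data behind each figure
FIGURES  figures and tables generated by the scripts
RESULTS  analysis outputs (PCA, linear regression)
SCRIPTS  code that turns DATA into FIGURES and RESULTS
"""


def create_project(root):
    """Create the folder skeleton and a README under root; return the root path."""
    root = Path(root)
    for sub in LAYOUT:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "README.txt"
    if not readme.exists():  # never overwrite an existing description
        readme.write_text(README_TEXT)
    return root
```

Running `create_project("Butterfly_project")` twice is harmless: existing folders are left alone and an existing README is not overwritten.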

  42. 2. think of it as training

  43. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3: 160018. doi:10.1038/sdata.2016.18
    Records, Coding, Workflows, &
    Research Objects

  44. Make data open with a DOI
    Findable
    Accessible
    Interoperable
    Reusable
    Quick win
    zenodo.org
    figshare.com
    osf.io
    For you (and for others)
    Yes, data can be private until you’re ready

  45. zenodo.org
    Yes you can keep
    data private until
    publication

  46. It's free osf.io

  47. osf.io

  49. File storage
    Integration of
    GDrive, Box,
    Dropbox, Git etc
    OSF cloud
    storage
    Everything in
    one place

  50. Activity
    All changes
    recorded with
    version control
    Roll back to
    previous
    versions
    Comments and
    collaborations

  51. Components
    are folders
    Structure and
    backup
    Robust sharing
    and privacy
    Can be
    published with
    a DOI

  52. try osf.io
    Easy to organise project
    Easy to store & publish data
    Easy to collaborate
    Easy reproducibility

  53. Making
    labwork
    reproducible
    protocols.io

  54. It's free

  55. Quick win
    METHODS SECTION
    Experimental procedures are briefly described here for context,
    and exact protocols and reagents are detailed in doi:1234567
    and doi:987654

  56. Summary

  57. It will save you time & effort
    Selfish reasons to be
    reproducible
    Write once and iterate, faster, helps with the manuscript,
    helps with reviewers, don’t start projects from
    scratch: build on prior reproducibility

  58. It will advance your career
    Selfish reasons to be
    reproducible
    Fast, cutting edge, future-proof, you’ll look good,
    more collaborators, extra citations, avoid
    career-ending disasters, builds a group, etc.

  59. Do not try to be
    completely
    reproducible!
    Shocking finale
    PTO...

  60. Do not decide to be reproducible. Decide to be a bit more
    reproducible, celebrate the small wins. Spread the word.
    Take home message
