Reproducible science: the good, the bad, the ugly and the untold

Tania Allard
September 13, 2019

Transcript

  1. Tania Allard, PhD [she/her] @ixek, Developer Advocate at Microsoft. Reproducible science: the good, the bad, the ugly and the untold. PyCon UK 2019. bit.ly/PyConUK-reproducibility
  2. A bit about me: Alan Turing Institute Industrial Fellow. I am a recovering researcher. Ex RSE – UK RSE Society Trustee. JOSS editor.
  3. I spend a lot of time thinking about reproducible and open science

  4. I spend a lot of time helping researchers to make research reproducible and open

  5. I spend a lot of time helping data scientists to make machine learning reproducible and transparent
  6. https://doi.org/10.5281/zenodo.3332808 It’s a spectrum

  7. https://doi.org/10.5281/zenodo.3332808

  8. https://doi.org/10.5281/zenodo.3332808

  9. https://doi.org/10.5281/zenodo.3332808

  10. https://doi.org/10.5281/zenodo.3332808

  11. https://doi.org/10.5281/zenodo.3332808

  12. Reproducible != Open

  13. THE GOOD

  14. None
  15. None
  16. Software has changed the world

  17. Software is changing the world… and research.

  18. Software Sustainability Institute Survey – do researchers use software? https://slides.com/simonhettrick/why-recognising-scientific-software-experts-is-key-to-open-science#/2/1

  19. Software Alliance https://slides.com/simonhettrick/why-recognising-scientific-software-experts-is-key-to-open-science#/2/1

  20. None
  21. A stellar example (NASA, https://flic.kr/p/tJbJf5)

  22. https://iopscience.iop.org/journal/2041-8205/page/Focus_on_EHT

  23. https://twitter.com/sweichwald/status/1116430285342695424

  24. First M87 Event Horizon Telescope Results. III. Data Processing and Calibration • A series of 6 papers published in April 2019 • Incredible long-term international collaboration (200+ scientists, 60 institutes, 18 countries, 6 continents) https://doi.org/10.3847/2041-8213/ab0c57
  25. Open source is EXTREMELY important for research

  26. None

  27. Open tools and open infrastructure

  28. None
  29. THE BAD

  30. What we say we want from research: - Innovation - Openness - Collaborations
  31. None

  32. None

  33. What academia actually demands

  34. The research cycle - simplified

  35. The research cycle - simplified

  36. How can research be transparent and reproducible if some parts are hidden?

  37. "An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures." Buckheit and Donoho (paraphrasing John Claerbout), Wavelab and Reproducible Research, 1995
  38. https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970 https://www.slideshare.net/JimGrange/the-reproducibility-crisis-in-psychological-science-one-year-later https://xkcd.com/882/

  39. There is a reproducibility crisis!

  40. There is a reproducibility chronic problem!

  41. THE UGLY

  42. None

  43. Researchers won’t change practices even if invited to.

  44. Not without the right incentives

  45. Let’s look at some examples

  46. https://statmodeling.stat.columbia.edu/2013/04/16/memo-to-reinhart-and-rogoff-i-think-its-best-to-admit-your-errors-and-go-on-from-there

  47. https://statmodeling.stat.columbia.edu/2013/04/16/memo-to-reinhart-and-rogoff-i-think-its-best-to-admit-your-errors-and-go-on-from-there https://www.bbc.co.uk/news/magazine-22223190

  48. https://retractionwatch.com/2019/01/04/japanese-stem-cell-fraud-leads-to-a-new-retraction/ https://www.theguardian.com/science/2015/feb/18/haruko-obokata-stap-cells-controversy-scientists-lie

  49. https://retractionwatch.com/2019/08/13/doing-the-right-thing-psychology-researchers-retract-paper-three-days-after-learning-of-coding-error/

  50. THE UNTOLD

  51. Reproducibility is good, right? Why would we be talking about it if not?

  52. Why reproducibility matters • Protects against bad actors • Helps with correctness • Helps ensure robustness • Makes it easier to collaborate

  53. Why reproducibility matters • Leads to progress • Enables strong baselines • Is necessary for extensibility • Makes you trustworthy
  54. https://doi.org/10.5281/zenodo.3332808

  55. That means supporting open infrastructure and open software

  56. To help with the last mile problem

  57. https://doi.org/10.5281/zenodo.2747640

  58. https://doi.org/10.5281/zenodo.2747640

  59. https://doi.org/10.5281/zenodo.2747640

  60. https://doi.org/10.5281/zenodo.2747640

  61. https://doi.org/10.5281/zenodo.2747640

  62. https://doi.org/10.5281/zenodo.2747640

  63. Are we making the best use of our public infrastructure?

  64. But the last mile is the hardest one!

  65. People are the hardest part of reproducible science

  66. COMPETITIVE

  67. https://doi.org/10.5281/zenodo.3332808

  68. Researchers are not developers…

  69. https://doi.org/10.5281/zenodo.3332808

  70. Software engineering Research

  71. Researcher Software engineer Software engineering Research

  72. Researcher Software engineer Researcher developer Research Software Engineer Software engineering Research

  73. Researchers are not data managers…

  74. https://doi.org/10.5281/zenodo.3332808

  75. People are the hardest part of reproducible science reproducibility and software citation

  76. • Citations are academic currency (whether they should be or not!) • They’re the best way we have to endorse good work. • We should be citing the software we use.
  77. None
  78. Fighting the good fight

  79. How can we change research? • Open tools and infrastructure - to make the last mile shorter • Checklists, processes - no excuse that they did not know • RSEs - no need to do it all alone • Community - to advocate for change

  80. Thank you. Contact me: @ixek trallard@bitsandchips.me http://bit.ly/PyConUK-reproducibility

  81. Credits and special thank yous • Chris Holdgraf and all of the Jupyter community https://speakerdeck.com/choldgraf/open-infrastructure-in-the-cloud-with-jupyterhub • Kirstie Whitaker and the Turing Way community https://doi.org/10.5281/zenodo.3238189 • Simon Hettrick and the RSE community & Society https://slides.com/simonhettrick/why-recognising-scientific-software-experts-is-key-to-open-science