Scientific Software and the Open Collaborative Web

Arfon Smith
November 17, 2013

Practices vary between scientific domains, but all too often research software is shared ad hoc between individuals, with little thought for the wider community. With code and computation routinely forming the backbone of many academic endeavours, we need to focus on publishing *all* of the products of our research: papers, software, data and provenance. In this talk I will highlight some promising examples from across academia and discuss how software development in the sciences could benefit from the collaboration norms of a modern open source project.

Transcript

  1. Scientific Software
    and the
    Open Collaborative Web
    Arfon Smith
    @arfon

  2. http://www.flickr.com/photos/blachswan

  3. http://www.flickr.com/photos/esoastronomy/

  4. http://www.flickr.com/photos/esoastronomy/
    http://www.flickr.com/photos/jamiegilbert

  5. http://amandabauer.blogspot.com/

  8. Diffraction grating
    Telescope
    Detector

  14. > cat bad_pix_mask.txt
    130 130 1 2048
    189 189 258 258
    480 562 378 378
    493 521 390 397
    851 851 247 274
    319 319 304 580
    493 511 610 636
    188 188 228 228
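
A file like this is easy to reuse once it is shared. The sketch below is only an illustration: the slide does not say what the four columns mean, so it assumes each row is `x1 x2 y1 y2`, an inclusive, 1-indexed rectangle of bad pixels. The names `MASK_TEXT`, `parse_mask` and `is_bad` are hypothetical and not from the talk.

```python
from typing import List, Tuple

# Assumption: each row is (x1, x2, y1, y2), inclusive and 1-indexed.
Region = Tuple[int, int, int, int]

# Contents of bad_pix_mask.txt as shown on the slide.
MASK_TEXT = """\
130 130 1 2048
189 189 258 258
480 562 378 378
493 521 390 397
851 851 247 274
319 319 304 580
493 511 610 636
188 188 228 228
"""


def parse_mask(text: str) -> List[Region]:
    """Turn the text file into a list of rectangular regions."""
    regions: List[Region] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        x1, x2, y1, y2 = (int(f) for f in fields)
        regions.append((x1, x2, y1, y2))
    return regions


def is_bad(regions: List[Region], x: int, y: int) -> bool:
    """Return True if pixel (x, y) falls inside any masked region."""
    return any(x1 <= x <= x2 and y1 <= y <= y2 for x1, x2, y1, y2 in regions)


regions = parse_mask(MASK_TEXT)
print(len(regions), "masked regions")
print("pixel (130, 1000) bad?", is_bad(regions, 130, 1000))
```

If the column order is actually different for this detector, only the unpacking line in `parse_mask` needs to change.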

  15. Wasteful

  16. Wasteful
    2 days work

  17. Wasteful
    2 days work
    3 observing runs/week

  18. Wasteful
    2 days work
    3 observing runs/week
    52 weeks in year

  19. Wasteful
    2 days work
    3 observing runs/week
    52 weeks in year
    15 year detector lifetime

  20. Wasteful
    2 days work
    3 observing runs/week
    52 weeks in year
    15 year detector lifetime
    2*3*52*15 = 4680 days (13 years)

  21. Wasteful… but the norm
    2 days work
    3 observing runs/week
    52 weeks in year
    15 year detector lifetime
    2*3*52*15 = 4680 days (13 years)
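
The estimate on this slide can be checked in a few lines. This is a minimal sketch using the figures from the slide. The variable names are illustrative, and the conversion to years assumes 365 calendar days per year, which the slide does not state.

```python
# Figures taken from the slide; variable names are illustrative.
days_per_run = 2              # "2 days work" repeated for each observing run
runs_per_week = 3
weeks_per_year = 52
detector_lifetime_years = 15

total_days = days_per_run * runs_per_week * weeks_per_year * detector_lifetime_years

# Assumption: convert using 365 calendar days per year.
total_years = total_days / 365

print(f"{total_days} days, roughly {total_years:.1f} years")
```

Counting working days instead of calendar days would make the figure in years larger still.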

  23. Why?

  24. We don’t know any different

  25. We’re taught to focus on
    research not tools

  26. We don’t act any different

  27. Products of research, not
    software valued

  28. Not treating software as first
    class research objects

  33. “publishing a paper about
    code is basically just
    advertising”
    David Donoho
    http://www.stanford.edu/~vcs/Video.html

  34. How is the Open Source
    community doing it?

  35. Culture of reuse

  36. Low friction collaboration

  38. The pull request

  39. Code first, permission later

  47. “open source is…
    reproducible by necessity”
    Fernando Perez
    http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html

  48. Better at collaborating
    because they have to be

  49. The social web
    likes, props, favs

  50. The social web
    The collaborative web

  51. GitHub is about helping people
    build software together

  52. What’s happening today?

  53. Collaboration around code

  54. Unidata: geosciences

  60. Astropy: astronomy

  66. Collaborative authoring

  69. Collaborative teaching

  73. Collaborative data collection

  76. Towards Collaborative
    Versioned Science

  77. How do we make this
    behaviour the norm?

  78. Incentive model

  79. Credit

  80. http://dx.doi.org/10.6084/m9.figshare.828487

  81. http://dx.doi.org/10.6084/m9.figshare.828487

  83. “publishing a paper about
    code is basically just
    advertising”
    David Donoho
    http://www.stanford.edu/~vcs/Video.html

  85. Derive meaningful metrics
    from open contributions

  86. “Academic environments of
    today do not reward tool
    builders”
    Ed Lazowska, OSTP event
    http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf

  87. A VISION AND STRATEGY
    FOR SOFTWARE FOR
    SCIENCE, ENGINEERING,
    AND EDUCATION

  88. Establish a virtuous cycle
    • 6 working groups, each with
    • 3-6 faculty from each institution
    http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf

  89. http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
    Software Tools, Environments, and Support
    • Software environments and tools are crucial
    • Organic, sustainable, reusable, extensible
    • Easy to translate across problem domains
    • The creation and usage of today’s tools and software
    environments are distracting from the science
    • Today's academic environments do not reward tool builders
    • How can the development, hardening, sustaining, sharing,
    and integration of techniques into a reusable software
    infrastructure be recognized and incentivized?
    Example approach: Teams of software architects, engineers,
    and researchers who will produce data science tools and will be
    evaluated on the impact of these tools

  90. What can we do today?

  91. Share more often

  92. If you’re going to share it then
    you better put a licence on it

  94. Share more often
    (no matter how small)

  95. 130 130 1 2048
    189 189 258 258
    480 562 378 378
    493 521 390 397
    851 851 247 274
    319 319 304 580
    493 511 610 636
    188 188 228 228
    > cat bad_pix_mask.txt
    > git clone [email protected]:arfon/aat/pixel_masks

  96. Thanks.
    [email protected]
    @arfon