Towards the Awesome - Making Collaborative Versioned Science a Reality

My talk from 'Open Science: The Transparency Revolution' at the NIAID Office of Cyber Infrastructure and Computational Biology (OCICB) annual Bioinformatics & Computational Biosciences Festival.

Arfon Smith

April 08, 2014

Transcript

  1. Towards the Awesome: Making Collaborative Versioned Science a Reality Arfon Smith @arfon Creative Commons Attribution 3.0 Unported License http://is.gd/NIAID
  2. http://www.flickr.com/photos/blachswan

  3. http://www.flickr.com/photos/esoastronomy/

  4. http://www.flickr.com/photos/esoastronomy/ http://www.flickr.com/photos/jamiegilbert

  5. http://amandabauer.blogspot.com/

  6. None
  7. None
  8. Diffraction grating Telescope Detector

  9. None
  10. None
  11. None
  12. None
  13. None
  14. > cat bad_pix_mask.txt 130 130 1 2048 189 189 258 258 480 562 378 378 493 521 390 397 851 851 247 274 319 319 304 580 493 511 610 636 188 188 228 228
  15. Wasteful

  16. Wasteful: 2 days work

  17. Wasteful: 2 days work, 3 observing runs/week

  18. Wasteful: 2 days work, 3 observing runs/week, 52 weeks in year

  19. Wasteful: 2 days work, 3 observing runs/week, 52 weeks in year, 15 year detector lifetime

  20. Wasteful: 2 days work, 3 observing runs/week, 52 weeks in year, 15 year detector lifetime. 2*3*52*15 = 4680 days (13 years)

  21. Wasteful… but the norm: 2 days work, 3 observing runs/week, 52 weeks in year, 15 year detector lifetime. 2*3*52*15 = 4680 days (13 years)
  22. We don’t know any different

  23. We’re taught to focus on research not tools

  24. We don’t act any different

  25. (Paper) products of research, not software valued

  26. Not treating software as first class research objects

  27. !

  28. What is a GitHub?

  29. None
  30. None
  31. None
  32. None
  33. GitHub

  34. None
  35. None
  36. None
  37. How does it work?

  38. The pull request

  39. None
  40. Code first, permission later

  41. None
  42. None
  43. None
  44. None
  45. None
  46. None
  47. None
  48. None
  49. Better at collaborating because they have to be

  50. “open source is… reproducible by necessity” Fernando Perez http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html

  51. ‘Open Source’ way of working

  52. Open = Public? (doesn’t have to mean this)

  53. Open (within your team, department or institution)

  54. Electronic

  55. Available

  56. Asynchronous, exposed process

  57. Lock-free

  58. Low friction collaboration

  59. What’s happening today?

  60. Collaboration around code

  61. None
  62. None
  63. None
  64. Collaborative authoring

  65. None
  66. None
  67. Collaborative teaching

  68. None
  69. None
  70. None
  71. Collaborative data collection

  72. None
  73. None
  74. Towards Collaborative Versioned Science

  75. How do we make this behaviour the norm?

  76. Credit

  77. None
  78. None
  79. None
  80. None
  81. “publishing a paper about code is basically just advertising” David Donoho http://www.stanford.edu/~vcs/Video.html
  82. None
  83. How to derive meaningful metrics from open contributions?

  84. None
  85. None
  86. Reproducibility Data intensive

  87. None
  88. None
  89. None
  90. Barriers are cultural, not technical

  91. “Academic environments of today do not reward tool builders” Ed Lazowska, OSTP event http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
  92. A VISION AND STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION
  93. Establish a virtuous cycle • 6 working groups, each with 3-6 faculty from each institution http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf

  94. http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf Software Tools, Environments, and Support

    • Software environments and tools are crucial
    • Organic, sustainable, reusable, extensible
    • Easy to translate across problem domains
    • The creation and usage of today’s tools and software environments are distracting from the science
    • Today's academic environments do not reward tool builders
    • How can the development, hardening, sustaining, sharing, and integration of techniques into a reusable software infrastructure be recognized and incentivized?
    Example approach: Teams of software architects, engineers, and researchers who will produce data science tools and will be evaluated on the impact of these tools
  95. What can you do today?

  96. http://www.flickr.com/photos/tamaleaver/

  97. Why are you sharing?

  98. Share more often

  99. If you’re going to share it then you better put a licence on it
  100. None
  101. Treat documentation as a first class entity

  102. Share more often (no matter how small)

  103. > git clone git@github.com:arfon/aat/pixel_masks > cat bad_pix_mask.txt 130 130 1 2048 189 189 258 258 480 562 378 378 493 521 390 397 851 851 247 274 319 319 304 580 493 511 610 636 188 188 228 228
  104. Thanks. arfon@github.com @arfon
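
The estimate on slides 20 and 21 (2*3*52*15 = 4680 days) can be checked with a few lines of Python. This is only a sketch: the constant and function names are made up for illustration, and dividing by 365 to get years is an assumption about how the slide's "13 years" was reached.

```python
# Figures taken from slides 16-20; all names here are illustrative.
DAYS_PER_RUN = 2                # days spent rebuilding the bad pixel mask per run
RUNS_PER_WEEK = 3               # observing runs per week
WEEKS_PER_YEAR = 52
DETECTOR_LIFETIME_YEARS = 15


def total_days_spent(days_per_run=DAYS_PER_RUN,
                     runs_per_week=RUNS_PER_WEEK,
                     weeks_per_year=WEEKS_PER_YEAR,
                     lifetime_years=DETECTOR_LIFETIME_YEARS):
    """Total person-days of repeated work over the detector's lifetime."""
    return days_per_run * runs_per_week * weeks_per_year * lifetime_years


def days_to_years(days, days_per_year=365):
    """Convert days to years (assumes 365-day years)."""
    return days / days_per_year


if __name__ == "__main__":
    days = total_days_spent()
    print(f"{days} days, roughly {days_to_years(days):.1f} years")
```

Changing any one input (for example, sharing the mask so it is built once per detector rather than once per run) shows how quickly the total drops.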