relationships between software publications
and software systems
David W Hogg
(NYU) (MPIA) (Flatiron)
twitter:@davidwhogg GitHub:davidwhogg
Slide 2
example
3x3 quadratic-fit method for centroiding stars in the SDSS
photometric pipeline
Lupton et al, 2000-ish; Vakili & Hogg, arXiv:1610.05873
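As a rough illustration of what a 3x3 quadratic-fit centroid can look like, here is a minimal sketch. It assumes a separable version of the idea: find the brightest interior pixel, then fit a parabola through that pixel and its two neighbours along each axis and take the vertex as the sub-pixel position. The function name `centroid_3x3_quadratic` and the synthetic test star are hypothetical, and this is not claimed to be the SDSS pipeline's implementation (which, among other things, may smooth the image first and fit a full two-dimensional polynomial).

```python
import numpy as np


def centroid_3x3_quadratic(image):
    """Estimate a sub-pixel (x, y) centroid from the 3x3 patch around the peak.

    Assumption (not from the slides): a separable fit, one parabola per axis,
    through the brightest interior pixel and its two neighbours.
    """
    image = np.asarray(image, dtype=float)

    # Search only interior pixels so the 3x3 patch always exists.
    interior = image[1:-1, 1:-1]
    iy, ix = np.unravel_index(np.argmax(interior), interior.shape)
    iy += 1
    ix += 1

    def parabola_offset(left, centre, right):
        # Vertex of the parabola through (-1, left), (0, centre), (1, right).
        curvature = left - 2.0 * centre + right
        if curvature >= 0.0:
            # Not a local maximum along this axis; fall back to the pixel centre.
            return 0.0
        return 0.5 * (left - right) / curvature

    dx = parabola_offset(image[iy, ix - 1], image[iy, ix], image[iy, ix + 1])
    dy = parabola_offset(image[iy - 1, ix], image[iy, ix], image[iy + 1, ix])
    return ix + dx, iy + dy


# Synthetic Gaussian "star" with a known centre (hypothetical test data).
yy, xx = np.mgrid[0:17, 0:21]
true_x, true_y = 10.3, 7.6
star = np.exp(-0.5 * ((xx - true_x) ** 2 + (yy - true_y) ** 2) / 2.0 ** 2)
est_x, est_y = centroid_3x3_quadratic(star)
```

The fit is exact for a purely quadratic peak and only approximate for a Gaussian one, which is part of why the ideas behind such a method are worth writing down somewhere other than the code.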
Slide 3
three related publishable entities
a software system or pipeline
a publication primarily about that system
(pure software publication)
a “science” paper that also describes the system
(paper with a software component)
Slide 4
citation of software
cite the source code (eg, via ASCL)
cite a pure software publication
cite a paper with a software component
Slide 5
benefits of writing a pure software publication
serves as discoverable documentation
preserves ideas in the code
legitimizes the code; distinguishes it from competitors
gets you refereed citations for your software work
Slide 6
example
our emcee paper (2013 PASP 125, 306) has >1000 citations and
is gaining about 40 more per month
Slide 7
costs of writing a pure software publication
takes time and money
produces a publication that some don’t respect
distracts from other scientific or career goals
(this is an opportunity cost)
Slide 8
time scales
different costs and benefits accrue on different time scales
eg, the time cost and the documentation benefit accrue
immediately
eg, the preservation-of-ideas benefit and the opportunity
cost accrue over very long time scales
Slide 9
#LTFDFCF
how you weigh costs and benefits depends strongly on your
career stage
and also on the importance (to you) of your software work
Slide 10
random thought: preservation of software
preservation of an installable executable
preservation of source code
preservation of fundamental ideas
Slide 11
the world is changing
respect for pure software contributions is much, much higher
than it was even 5 years ago (let alone 10 or 20)
(Thank you, AAS community:
Some of my students bet their careers on this.)
Slide 12
benefits of writing a paper with a software component
preserves ideas in the code, or some of them
legitimizes the code; distinguishes it from competitors
refereed citations for your software work
situates the software in an appropriate scientific context
Slide 13
appropriate scientific context
(I know this isn’t a talk about open-source software, but:)
one of the biggest costs of releasing a software system is time
spent batting down inappropriate uses
Slide 14
costs of writing a paper with a software component
takes time and money
(probably) doesn’t serve the documentation value of a pure
software publication
Slide 15
double jeopardy
I don’t like the idea of refereeing the software systems in
addition to the papers about them.
I don’t like the statistical editor comments on data analysis
papers submitted to the AAS Journals.
Slide 16
benefits of having no traditional publication
save time and money; no refereeing headaches either
citation of the software is direct citation of the relevant work
the ASCL and ADS and GitHub have made all this very easy
Slide 17
costs of having no traditional publication
documentation remains necessary
citations are (currently) unrefereed (and less respected,
perhaps)
uncertain long-term value of citations; preservation issues
hard to distinguish your software system from competitors
Slide 18
intangible value of the literature
preservation, ultra-long-term usefulness, interdisciplinarity,
criticism, remixing
Slide 19
advice
when you release software, explicitly state how you want it
cited, alongside or very near the license statement
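One way to act on this advice is to generate the citation request and the license statement from the same place, so the two always appear together. The sketch below is only an illustration: the helper `citation_notice`, the package name `mypipeline`, the license name, and the bracketed citation fields are all hypothetical placeholders, not a convention prescribed by the ASCL, ADS, or GitHub.

```python
def citation_notice(software_name, license_name, preferred_citation):
    """Build a short plain-text notice pairing the license with the citation request.

    All arguments are free-form strings supplied by the software author.
    """
    lines = [
        f"{software_name} is released under the {license_name} license.",
        "If you use this software in your research, please cite:",
        f"    {preferred_citation}",
    ]
    return "\n".join(lines)


# Hypothetical package and placeholder citation fields, for illustration only.
notice = citation_notice(
    software_name="mypipeline",
    license_name="MIT",
    preferred_citation="<authors> (<year>), <journal> <volume>, <page>",
)
print(notice)
```

The same string could be written into the README and exposed from the package itself, so users meet the request wherever they first encounter the license.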
Slide 20
conclusions
for me, the benefits of producing traditional publications are
legion (though sometimes a bit intangible)
both pure software publications and papers with a software
component are very valuable
the (new) direct cite-ability of software does not much diminish
these benefits