Slide 1

Slide 1 text

When Writing It Down is Not Enough
The Era of Computational Notebooks
Neil Ernst
Computer Science, University of Victoria
neilernst.net

Slide 2

Slide 2 text

About Me
Undergrad in Geography/GIS
PhD in formal models of software
4 years as software architecture consultant and researcher
Currently assistant professor in CS at UVic
Study software design and use in the “machine learning” era
Extensively study and use notebooks in my research

Slide 4

Slide 4 text

Notebooks Through the Ages
Leonardo da Vinci

Slide 7

Slide 7 text

Isaac Newton

Slide 9

Slide 9 text

Charles Darwin

Slide 11

Slide 11 text

Marie Curie

Slide 13

Slide 13 text

Rosalind Franklin

Slide 15

Slide 15 text

Margaret Mead

Slide 16

Slide 16 text

Constraints

Slide 17

Slide 17 text

Scientific Notebooks:
Describe: a temporal record of progress

Slide 18

Slide 18 text

Scientific Notebooks:
Explore: a record of key parameters and ‘knob positions’

Slide 19

Slide 19 text

Scientific Notebooks:
Explain: a shareable repository for replication

Slide 20

Slide 20 text

Scientific Notebooks:
Afford: portable, cheap, and easy to use

Slide 21

Slide 21 text

But …

Slide 22

Slide 22 text

Fragile

Slide 23

Slide 23 text

They burn
‘The Burning of the Library of Alexandria’, by Hermann Goll (1876)

Slide 24

Slide 24 text

Idiosyncratic and personal

Slide 25

Slide 25 text

Handwriting dependent

Slide 26

Slide 26 text

Static and immutable

Slide 28

Slide 28 text

What is a computational notebook?

Slide 31

Slide 31 text

http://www.apkc.net/external/msc_578d_dm_project.html

Slide 34

Slide 34 text

http://intro.syzygy.ca/

Slide 35

Slide 35 text

R Notebooks

Slide 39

Slide 39 text

Computational Notebooks
Descriptive: narrative + execution results
Interactive: rapid feedback on changes*
Connected: huge ecosystem of relevant libraries

Slide 40

Slide 40 text

Balancing exploration and engineering

Slide 41

Slide 41 text

“most researchers are never taught the equivalent of basic lab skills for research computing”
Software Carpentry

Slide 43

Slide 43 text

(Data) Science is about creating software!

Slide 47

Slide 47 text

Hiring Analytics
Software Engineering
Data Science
Examine years of hiring outcomes
Apply algorithm to filter resumes

Slide 49

Slide 49 text

It penalized resumes that included the word “women’s,” as in “women’s chess club captain.” And it downgraded graduates of two all-women’s colleges.

Slide 50

Slide 50 text

Some Engineering Problems with Notebooks

Slide 51

Slide 51 text

Longevity and version control
paper-2-v3-ne-final.docx
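
One common way to make a notebook friendlier to version control is to strip execution outputs before committing, so that diffs show only changes to code and narrative. The sketch below is an illustration, not something the talk prescribes. It assumes the Jupyter notebook JSON layout (format version 4), where a top-level "cells" list holds code cells with "outputs" and "execution_count" fields. The function name strip_outputs and the in-memory example notebook are hypothetical.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed.

    Assumes the nbformat v4 layout; other versions are not handled here.
    """
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # drop the In [n]: counter
    return cleaned


# Tiny in-memory notebook, used only for illustration.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

stripped = strip_outputs(example)
print(json.dumps(stripped, indent=1, sort_keys=True))
```

The original dict is left untouched, so the same function could be run as a pre-commit step on a file loaded with json.load without changing the working copy in memory.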

Slide 53

Slide 53 text

Testing and modularity
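
A minimal sketch of what testing and modularity can look like for notebook code: logic that would otherwise live inline in a cell is moved into a small function, which can then be checked by a test and reused from any cell. The function normalize and its test are hypothetical examples and are not taken from the talk.

```python
def normalize(values):
    """Scale a non-empty list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid dividing by zero when all values are identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize():
    # Evenly spaced input maps to evenly spaced output.
    assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]
    # Constant input is handled without a ZeroDivisionError.
    assert normalize([5, 5]) == [0.0, 0.0]


test_normalize()
```

Kept in a plain .py module, a function like this can be imported by the notebook and run by a test runner, which is harder to do with logic spread across cells.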

Slide 54

Slide 54 text

Notebook carpentry

Slide 57

Slide 57 text

Software Engineering: Versioned, Testable, Modular, Documented
Data Science: Interactive, Exploratory, Descriptive, Connected

Slide 59

Slide 59 text

→ Computational notebooks remain personal, messy, and individual
→ Tradeoff: “exploration vs engineering”
Neil Ernst
University of Victoria
neilernst.net