Slide 1

Slide 1 text

growing your inner data scientist: tips to success in data science, from questions to results Mine Çetinkaya-Rundel Duke University + RStudio 🔗 bit.ly/grow-ds-21 mine-cetinkaya-rundel [email protected] @minebocek

Slide 2

Slide 2 text

professor of statistics data scientist book author minebocek mine-cetinkaya-rundel mine citizenstatistician

Slide 3

Slide 3 text

what is data science?

Slide 4

Slide 4 text

datascience.berkeley.edu/about/what-is-data-science r4ds.had.co.nz/explore-intro.html oreilly.com/library/view/doing-data-science/9781449363871/ch01.html

Slide 5

Slide 5 text

data science is vague and evolving…

Slide 6

Slide 6 text

1 always be curious

Slide 7

Slide 7 text

keep informed: books, articles, blogs

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

keep current

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

keep engaged: conferences, workshops, meetups, webinars

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

2 improve your workflow

Slide 14

Slide 14 text

rstats.wtf

Slide 15

Slide 15 text

r4ds.had.co.nz

Slide 16

Slide 16 text

3 share your output

Slide 17

Slide 17 text

David Robinson (@drob), rstudio::conf 2019, The Unreasonable Effectiveness of Public Work
How I used to think of my goals (less valuable to more valuable): Idea, Preliminary results, Draft manuscript, Completed manuscript, Published paper
How I should have been thinking of them:
Less valuable: anything still on your computer (data, code, results, draft, finished paper)
More valuable: anything out in the world (paper, preprint, product, blog post, open source, tweet)

Slide 18

Slide 18 text

share the things you create

Slide 19

Slide 19 text

share the things you create big

Slide 20

Slide 20 text

datasciencebox.org

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

share the things you create little

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

share the things you learn

Slide 25

Slide 25 text

Mara Averick (@dataandme), EARL 2017, leaRning out loud
“Sometimes I go on Twitter, and I tend to learn out loud.”

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

# March 2019
library(tidyverse)
ggplot(mtcars, aes(x = wt, y = mpg)) %>%
  geom_point()
#> Error: `mapping` must be created by `aes()`
#> Did you use %>% instead of +?
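
The error above hints at the fix: ggplot2 layers are combined with `+`, not the pipe. A minimal sketch of the corrected call, assuming only that ggplot2 is installed (the object name `p` is just for illustration):

```r
library(ggplot2)

# Layers are added to a plot with `+`; the pipe `%>%` would instead pass the
# plot object as the first argument (`mapping`) of geom_point().
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

p
```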

Slide 29

Slide 29 text

share your questions

Slide 30

Slide 30 text

Thiago Macieira, “The Art of Problem Solving.” In Open Advice: FOSS: What We Wish We Had Known When We Started, edited by Lydia Pintscher, 55–61.
“The most useless problem statement that one can face is ‘it doesn’t work’, yet we seem to get it far too often.”

Slide 31

Slide 31 text

TEN SIMPLE RULES FOR GETTING HELP FROM ONLINE SCIENTIFIC COMMUNITIES
1. Don’t be afraid to ask a question
2. State the question clearly
3. Learn established customs before posting
4. Don’t ask what has already been answered
5. Always use a good title
6. Do your homework before posting
7. Proofread your post
8. Be courteous to other forum members
9. Remember that the archive of your question can be helpful to others
10. Give back to the community
Dall’Olio, Giovanni M., Jacopo Marino, Michael Schubert, Kevin L. Keys, Melanie I. Stefan, Colin S. Gillespie, Pierre Poulain, et al. 2011. “Ten Simple Rules for Getting Help from Online Scientific Communities.” PLoS Computational Biology 7 (9): 10–12. doi:10.1371/journal.pcbi.1002202.

Slide 32

Slide 32 text

suppose…
# Goal: "1 a" "2 b" "3 c" "4 d" "5 e"

Slide 33

Slide 33 text

I’m trying to create the following vector in R: "1 a" "2 b" "3 c" "4 d" "5 e"
So I define X to be 1:5 and Y to be the first 5 letters of the alphabet, but when I add them I get the following error.
Error in x + y : non-numeric argument to binary operator
🤷 Q

Slide 34

Slide 34 text

I’m trying to create the following vector in R: "1 a" "2 b" "3 c" "4 d" "5 e"
Below is a screenshot of what I tried. Why is it not working?
🤷 Q

Slide 35

Slide 35 text

writing good questions
library(reprex)
Prepare reproducible examples for posting to GitHub issues, StackOverflow, or Slack snippets.
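
As a rough sketch of how this might look in practice, the snippet below passes a small piece of code to `reprex()` and captures the rendered text. It assumes the reprex package and its rendering dependencies (rmarkdown, pandoc) are available; the variable name `out` and the choice of `venue = "gh"` are illustrative, not taken from the slides.

```r
library(reprex)

# Render a small snippet as a reproducible example. The braces hold the code
# to run; the rendered text includes the code and its output or error.
out <- reprex(
  {
    x <- 1:5
    y <- letters[1:5]
    x + y
  },
  venue = "gh",          # GitHub-flavored Markdown; "so" and "slack" also exist
  html_preview = FALSE   # do not open a preview window
)

# `out` is a character vector, one element per line of the rendered example
cat(out, sep = "\n")
```

The rendered text can then be pasted straight into a question, so readers see the exact code and the exact error rather than a description or a screenshot.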

Slide 36

Slide 36 text

I’m trying to create the following vector in R: "1 a" "2 b" "3 c" "4 d" "5 e"
Below is what I tried. What does this error mean, and how can I fix it? 🤷 Q
x <- 1:5
y <- letters[1:5]
x + y
#> Error in x + y: non-numeric argument to binary operator
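
The slides do not show an answer, but one plausible reply to a question posed this way: `+` is arithmetic and fails because `y` is a character vector, while base R's `paste()` joins values element by element with a space by default. A minimal sketch (the name `result` is illustrative):

```r
x <- 1:5
y <- letters[1:5]

# paste() coerces both vectors to character and joins them pairwise,
# using a single space as the default separator
result <- paste(x, y)
result
#> [1] "1 a" "2 b" "3 c" "4 d" "5 e"
```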

Slide 37

Slide 37 text

4 contribute to community

Slide 38

Slide 38 text

find open source projects you enjoy, and start contributing

Slide 39

Slide 39 text

contribute to books

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

contribute to packages

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

contributing to oss: get the pulse of a project; read the code; watch the repo; discuss your ideas; make a pull request; review CoC + contributing guide

Slide 45

Slide 45 text

5 collaborate with others

Slide 46

Slide 46 text

collaborate on process

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

collaborate in class

Slide 49

Slide 49 text

blog post, portfolio entry, competition submission, …

Slide 50

Slide 50 text

collaborate outside class

Slide 51

Slide 51 text

John M. Chambers Statistical Software Award 🔗 stat-computing.org/awards/jmc
ASA StatComp Student Paper Competition 🔗 stat-computing.org/awards/student
Kaggle: Prediction competition 🔗 kaggle.com/competitions

Slide 52

Slide 52 text

6 broadcast your work

Slide 53

Slide 53 text

make data visualizations

Slide 54

Slide 54 text

🗓 Every Tuesday 🔗 github.com/rfordatascience/tidytuesday 🐦 #TidyTuesday

Slide 55

Slide 55 text

speak at events

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

write blog posts

Slide 58

Slide 58 text

bookdown.org/yihui/blogdown apreshill.com/blog/2020-12-new-year-new-blogdown

Slide 59

Slide 59 text

keeping a blog alive: find co-authors; keep it regular; write themed posts; review events

Slide 60

Slide 60 text

1 always be curious
2 improve your workflow
3 share your output
4 contribute to community
5 collaborate with others
6 broadcast your work

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

mine-cetinkaya-rundel [email protected] @minebocek growing your inner data scientist 🔗 bit.ly/grow-ds-21 Mine Çetinkaya-Rundel Duke University + RStudio