Slide 1

Slide 1 text

Emily Robinson Data Scientist at DataCamp Build a Strong Data Science Portfolio

Slide 2

Slide 2 text

About Me ➔ Background in the social sciences ➔ R programmer ➔ Formerly at Etsy ➔ Data Scientist at DataCamp ➔ Writing a book on data science careers with Jacqueline Nolis

Slide 3

Slide 3 text

Why have a portfolio?

Slide 4

Slide 4 text

Build your skills

Slide 5

Slide 5 text

Advance your Career http://varianceexplained.org/r/year_data_scientist/

Slide 6

Slide 6 text

Expand your Network

Slide 7

Slide 7 text

Give Back

Slide 8

Slide 8 text

Some components Analyses Blog posts Talks Apps Open Source Projects

Slide 9

Slide 9 text

Analyses

Slide 10

Slide 10 text

How?

Slide 11

Slide 11 text

Dataset -> Question

Slide 12

Slide 12 text

Dataset -> Question

Slide 13

Slide 13 text

Question -> Dataset http://varianceexplained.org/r/trump-tweets/

Slide 14

Slide 14 text

Tip 1: Include visualizations https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6

Slide 15

Slide 15 text

Tip 2: Choose a topic you’re excited about https://masalmon.eu/2018/01/01/sortinghat/

Slide 16

Slide 16 text

Tip 3: Limit your scope https://kkulma.github.io/2017-08-13-friendships-among-top-r-twitterers/

Slide 17

Slide 17 text

Making progress (inspired by bit.ly/drob-rstudio-2019) [Figure: two scales, each running from "Less valuable" to "More valuable". "How I used to think about analyses": Idea, Getting data, Cleaning, Exploratory, Modeling, Final result. "How I think about analyses now": Work only on your computer versus Work online (GitHub, Blog, Kaggle).]

Slide 18

Slide 18 text

The Full process

Slide 19

Slide 19 text

Put it on GitHub

Slide 20

Slide 20 text

Blog posts

Slide 21

Slide 21 text

Where? ➔ Easy & quick to set up ➔ Organic traffic (Medium) ➔ Less customizability/control

Slide 22

Slide 22 text

Where? ➔ Complete control ➔ Always free ➔ Takes a little longer to set up ➔ May get stuck debugging issues

Slide 23

Slide 23 text

Explain your analysis https://theambitiouseconomist.com/an-analysis-of-the-gender-wage-gap-in-australia/

Slide 24

Slide 24 text

Teach a concept https://juliasilge.com/blog/stack-overflow-pca/

Slide 25

Slide 25 text

Share your experience https://d4tagirl.com/2018/08/rstudio-conf-diversity-scholarships-for-the-win

Slide 26

Slide 26 text

Give advice www.rladiesnyc.org/post/2019-nyr-conference-tips/ towardsdatascience.com/prioritizing-data-science-work-936b3765fd45

Slide 27

Slide 27 text

Talks

Slide 28

Slide 28 text

Not just for extroverts!

Slide 29

Slide 29 text

Where to start?

Slide 30

Slide 30 text

First talk Jared Lander (meetup organizer) asked me to speak

Slide 31

Slide 31 text

Snowball effect

Slide 32

Slide 32 text

CFPs (Calls for Proposals) ➔ Look for conferences offering first-time speaker help ➔ Be succinct ➔ Focus on the outcome for attendees ➔ Make it topical ➔ R-Ladies abstract review

Slide 33

Slide 33 text

Apps

Slide 34

Slide 34 text

App for following conference tweets https://gadenbuie.shinyapps.io/tweet-conf-dash/

Slide 35

Slide 35 text

Tweet mashup https://tweetmashup.com/

Slide 36

Slide 36 text

How?

Slide 37

Slide 37 text

Share it!

Slide 38

Slide 38 text

Open Source Projects

Slide 39

Slide 39 text

Help yourself & others https://www.rstudio.com/resources/videos/contributing-to-tidyverse-packages/

Slide 40

Slide 40 text

Documentation is a great place to start "There are many ways to contribute to scikit-learn … Improving the documentation is no less important than improving the library itself." (From the scikit-learn contributing guide, emphasis mine) https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md

Slide 41

Slide 41 text

Is anything too small? https://jcahoon.netlify.com/post/2019/06/16/first-two-weeks-this-summer-at-rstudio/

Slide 42

Slide 42 text

The process https://thisisnic.github.io/2018/11/28/ten-steps-to-becoming-a-tidyverse-contributor/ ➔ Watch the repos ➔ Ask if you can help w/ issues ➔ Make the changes & submit a pull request

Slide 43

Slide 43 text

Contribute to beginner-friendly issues bit.ly/drob-rstudio-2019

Slide 44

Slide 44 text

Making your own package/library ➔ If you’ve copied and pasted code three times, write a function ➔ If you’ve used the same function across three analyses, write a package
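
As a minimal sketch of the "copied and pasted three times, write a function" rule of thumb: the data and the function name `normalize` below are hypothetical and not from the talk. Python is used here only because the deck mentions R / Python. The same rescaling that might otherwise be pasted once per column is written once and reused:

```python
# Hypothetical example data: three columns that each need the same rescaling.
heights = [150.0, 160.0, 170.0]
weights = [50.0, 65.0, 80.0]
ages = [20.0, 30.0, 40.0]


def normalize(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]


# One function call per column replaces three copies of the same arithmetic.
heights_scaled = normalize(heights)
weights_scaled = normalize(weights)
ages_scaled = normalize(ages)
```

Once a helper like this is reused across several analyses, the second rule of thumb applies and it becomes a candidate for a small package.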

Slide 45

Slide 45 text

Isn’t making a package for “advanced programmers”? From Susan Johnston, as used in Jim Hester’s RStudio 2018 conference talk https://resources.rstudio.com/rstudio-conf-2018/you-can-make-a-package-in-20-minutes-jim-hester ➔ Can you open and run R / Python? ➔ Can you install a package? ➔ Can you write code? ➔ Can you write a function? ➔ Can you learn to write a function? Excellent, you can write a package!

Slide 46

Slide 46 text

Conclusion

Slide 47

Slide 47 text

Potential Components Analyses Blog posts Talks Apps Open Source Projects

Slide 48

Slide 48 text

Additional resources ➔ Making Peace with Personal Branding by Rachel Thomas ➔ Overcoming Social Anxiety by Steph Locke ➔ Keeping up with blogdown by Mara Averick ➔ Advice to Aspiring Data Scientists: Start a Blog by David Robinson ➔ Speaking at conference by Cassie Kozyrkov ➔ List of speaking resources ➔ R packages book by Hadley Wickham and Jenny Bryan

Slide 49

Slide 49 text

Thank you! bit.ly/erdatamatters hookedondata.org @robinson_es bit.ly/buildcareerds

Slide 50

Slide 50 text

Benefits