Build a Strong Data Science Portfolio

Emily Robinson
June 24, 2019

Transcript

  1. Emily Robinson
    Data Scientist at DataCamp
    Build a Strong Data Science Portfolio

  2. About Me
    ➔ Background in the social sciences
    ➔ R programmer
    ➔ Formerly at Etsy
    ➔ Data Scientist at DataCamp
    ➔ Writing a book on data science careers with Jacqueline Nolis

  3. Why have a portfolio?

  4. Build your skills

  5. Advance your Career
    http://varianceexplained.org/r/year_data_scientist/

  6. Expand your Network

  7. Some components
    Analyses
    Blog posts
    Talks
    Apps
    Open Source Projects

  8. Dataset -> Question

  9. Dataset -> Question

  10. Question -> Dataset
    http://varianceexplained.org/r/trump-tweets/

  11. Tip 1: Include visualizations
    https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6

  12. Tip 2: Choose a topic you’re excited about
    https://masalmon.eu/2018/01/01/sortinghat/

  13. Tip 3: Limit your scope
    https://kkulma.github.io/2017-08-13-friendships-among-top-r-twitterers/

  14. Making progress
    Inspired by bit.ly/drob-rstudio-2019
    How I used to think about analyses (less valuable to more valuable):
    Idea, Getting data, Cleaning, Exploratory, Modeling, Final result
    How I think about analyses now (less valuable to more valuable):
    Work only on your computer, Work online (GitHub, Blog, Kaggle)

  15. The Full process

  16. Put it on GitHub

  17. Where?
    ➔ Easy & quick to set up
    ➔ Organic traffic (medium)
    ➔ Less customizability/control

  18. Where?
    ➔ Complete control
    ➔ Always free
    ➔ Takes a little longer to set up
    ➔ May get stuck debugging issues

  19. Explain your analysis
    https://theambitiouseconomist.com/an-analysis-of-the-gender-wage-gap-in-australia/

  20. Teach a concept
    https://juliasilge.com/blog/stack-overflow-pca/

  21. Share your experience
    https://d4tagirl.com/2018/08/rstudio-conf-diversity-scholarships-for-the-win

  22. Give advice
    www.rladiesnyc.org/post/2019-nyr-conference-tips/
    towardsdatascience.com/prioritizing-data-science-work-936b3765fd45

  23. Not just for extroverts!

  24. Where to start?

  25. First talk
    Jared Lander (meetup organizer) asked me to speak

  26. Snowball effect

  27. CFPs (Call for Proposals)
    ➔ Look for conferences offering first-time speaker help
    ➔ Be succinct
    ➔ Focus on the outcome for attendees
    ➔ Make it topical
    ➔ R-Ladies abstract review

  28. App for following conference tweets
    https://gadenbuie.shinyapps.io/tweet-conf-dash/

  29. Tweet mashup
    https://tweetmashup.com/

  30. Open Source Projects

  31. Help yourself & others
    https://www.rstudio.com/resources/videos/contributing-to-tidyverse-packages/

  32. Documentation is a great place to start
    “There are many ways to contribute to scikit-learn … Improving the documentation is no less important than improving the library itself.”
    From the scikit-learn contributing guide (emphasis mine)
    https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md

  33. Is anything too small?
    https://jcahoon.netlify.com/post/2019/06/16/first-two-weeks-this-summer-at-rstudio/

  34. The process
    https://thisisnic.github.io/2018/11/28/ten-steps-to-becoming-a-tidyverse-contributor/
    ➔ Watch the repos
    ➔ Ask if you can help with issues
    ➔ Make the changes & submit a pull request

  35. Contribute to beginner-friendly issues
    bit.ly/drob-rstudio-2019

  36. Making your own package/library
    If you’ve copied and pasted code three times, write a function
    If you’ve used the same function across three analyses, write a package
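
The first half of this rule of thumb can be sketched in a few lines of Python. This is only an illustration and not from the talk: the `rescale` function and the three example columns are hypothetical, and the assumption is that the same min-max rescaling had previously been pasted once per column.

```python
# Hypothetical example: the same rescaling logic had been copied and
# pasted for three columns, so it is pulled out into one function.

def rescale(values):
    """Rescale a list of numbers to the 0-1 range (min-max scaling)."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Made-up columns standing in for data from an analysis.
heights = [150, 165, 180]
weights = [50, 70, 90]
ages = [20, 35, 50]

# One function call per column instead of three pasted code blocks.
scaled = {
    name: rescale(column)
    for name, column in [("heights", heights), ("weights", weights), ("ages", ages)]
}
```

Once a function like this shows up in several separate analyses, moving it into a small package is the natural next step the slide describes.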

  37. Isn’t making a package for “advanced programmers”?
    From Susan Johnston, as used in Jim Hester’s RStudio 2018 conference talk
    https://resources.rstudio.com/rstudio-conf-2018/you-can-make-a-package-in-20-minutes-jim-hester
    ➔ Can you open and run R / Python?
    ➔ Can you install a package?
    ➔ Can you write code?
    ➔ Can you write a function?
    ➔ Can you learn to write a function?
    Excellent, you can write a package!

  38. Potential Components
    Analyses
    Blog posts
    Talks
    Apps
    Open Source Projects

  39. Additional resources
    ➔ Making Peace with Personal Branding by Rachel Thomas
    ➔ Overcoming Social Anxiety by Steph Locke
    ➔ Keeping up with blogdown by Mara Averick
    ➔ Advice to Aspiring Data Scientists: Start a Blog by David Robinson
    ➔ Speaking at conference by Cassie Kozyrkov
    ➔ List of speaking resources
    ➔ R packages book by Hadley Wickham and Jenny Bryan

  40. Thank you!
    bit.ly/erdatamatters
    hookedondata.org
    @robinson_es
    bit.ly/buildcareerds
