Build a Strong Data Science Portfolio

Emily Robinson
June 24, 2019

Transcript

  1. Emily Robinson
    Data Scientist at DataCamp
    Build a Strong Data Science
    Portfolio

  2. About Me
    ➔ Background in the social sciences
    ➔ R programmer
    ➔ Formerly at Etsy
    ➔ Data Scientist at DataCamp
    ➔ Writing a book on data science careers
    with Jacqueline Nolis

  3. Why have a portfolio?

  4. Build your skills

  5. Advance your Career
    http://varianceexplained.org/r/year_data_scientist/

  6. Expand your Network

  7. Give Back

  8. Some components
    Analyses
    Blog posts
    Talks
    Apps
    Open Source Projects

  9. Analyses

  10. How?

  11. Dataset -> Question

  12. Dataset -> Question

  13. Question -> Dataset
    http://varianceexplained.org/r/trump-tweets/

  14. Tip 1: Include visualizations
    https://hackernoon.com/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6

  15. Tip 2: Choose a topic you’re excited about
    https://masalmon.eu/2018/01/01/sortinghat/

  16. Tip 3: Limit your scope
    https://kkulma.github.io/2017-08-13-friendships-among-top-r-twitterers/

  17. Making progress
    Inspired by bit.ly/drob-rstudio-2019
    How I used to think about analyses (less valuable -> more valuable):
    Idea, Getting data, Cleaning, Exploratory, Modeling, Final result
    How I think about analyses now (less valuable -> more valuable):
    Work only on your computer, Work online (GitHub, Blog, Kaggle)

  18. The Full process

  19. Put it on GitHub

  20. Blog posts

  21. Where?
    ➔ Easy & quick to set up
    ➔ Organic traffic (medium)
    ➔ Less customizability/control

  22. Where?
    ➔ Complete control
    ➔ Always free
    ➔ Takes a little longer to set up
    ➔ May get stuck debugging issues

  23. Explain your analysis
    https://theambitiouseconomist.com/an-analysis-of-the-gender-wage-gap-in-australia/

  24. Teach a concept
    https://juliasilge.com/blog/stack-overflow-pca/

  25. Share your experience
    https://d4tagirl.com/2018/08/rstudio-conf-diversity-scholarships-for-the-win

  26. Give advice
    www.rladiesnyc.org/post/2019-nyr-conference-tips/
    towardsdatascience.com/prioritizing-data-science-work-936b3765fd45

  27. Talks

  28. Not just for extroverts!

  29. Where to start?

  30. First talk
    Jared Lander
    (meetup organizer)
    asked me to speak

  31. Snowball effect

  32. CFPs (Call for Proposals)
    ➔ Look for conferences offering first-time speaker help
    ➔ Be succinct
    ➔ Focus on the outcome for attendees
    ➔ Make it topical
    ➔ R-Ladies abstract review

  33. Apps

  34. App for following conference tweets
    https://gadenbuie.shinyapps.io/tweet-conf-dash/

  35. Tweet mashup
    https://tweetmashup.com/

  36. How?

  37. Share it!

  38. Open Source Projects

  39. Help yourself & others
    https://www.rstudio.com/resources/videos/contributing-to-tidyverse-packages/

  40. Documentation is a great place to start
    There are many ways to contribute to
    scikit-learn … Improving the
    documentation is no less important than
    improving the library itself.
    - From scikit-learn contributing guide (emphasis mine)
    https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md

  41. Is anything too small?
    https://jcahoon.netlify.com/post/2019/06/16/first-two-weeks-this-summer-at-rstudio/

  42. The process
    https://thisisnic.github.io/2018/11/28/ten-steps-to-becoming-a-tidyverse-contributor/
    ➔ Watch the repos
    ➔ Ask if you can help w/ issues
    ➔ Make the changes & submit a
    pull request

  43. Contribute to beginner-friendly issues
    bit.ly/drob-rstudio-2019

  44. If you’ve copied and pasted
    code three times, write a
    function
    If you’ve used the same
    function across three
    analyses, write a package
    Making your own package/library
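
A minimal sketch of the "three times, write a function" rule, in Python. The function name `summarize` and the three small datasets are hypothetical and only illustrate the idea; they do not come from the talk.

```python
# Hypothetical example: the names and numbers are illustrative only.

def summarize(values):
    """Return the count, mean, and range of a list of numbers."""
    count = len(values)
    mean = sum(values) / count
    return {"count": count, "mean": mean, "range": max(values) - min(values)}


# Before: the same summary lines were pasted once per dataset.
# After: one function, called once per dataset.
sales = [10, 12, 9]
visits = [100, 150, 125]
ratings = [4, 5, 3]

summaries = {
    name: summarize(data)
    for name, data in [("sales", sales), ("visits", visits), ("ratings", ratings)]
}
```

If `summarize` then turns up in a third analysis, the second half of the rule applies: it is a candidate for a small package of its own.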

  45. Isn’t making a package for “advanced programmers”?
    From Susan Johnston, as used in Jim Hester’s RStudio 2018 conference talk
    https://resources.rstudio.com/rstudio-conf-2018/you-can-make-a-package-in-20-minutes-jim-hester
    ☐ Can you open and run R / Python?
    ☐ Can you install a package?
    ☐ Can you write code?
    ☐ Can you write a function?
    ☐ Can you learn to write a function?
    Excellent, you can write a package!

  46. Conclusion

  47. Potential Components
    Analyses
    Blog posts
    Talks
    Apps
    Open Source Projects

  48. Additional resources
    ➔ Making Peace with Personal Branding by Rachel Thomas
    ➔ Overcoming Social Anxiety by Steph Locke
    ➔ Keeping up with blogdown by Mara Averick
    ➔ Advice to Aspiring Data Scientists: Start a Blog by David Robinson
    ➔ Speaking at conference by Cassie Kozyrkov
    ➔ List of speaking resources
    ➔ R packages book by Hadley Wickham and Jenny Bryan

  49. Thank you!
    bit.ly/erdatamatters
    hookedondata.org
    @robinson_es
    bit.ly/buildcareerds

  50. Benefits
