The art and science of teaching data science (Nordstat)

Modern statistics is fundamentally a computational discipline, but too often this fact is not reflected in our statistics curricula. With the rise of data science, it has become increasingly clear that students want, expect, and need explicit training in this area of the discipline. Additionally, recent curricular guidelines clearly state that working with data requires extensive computing skills and that statistics students should be fluent in accessing, manipulating, analyzing, and modeling with professional statistical analysis software. In this talk, we introduce the design philosophy behind an introductory data science course and discuss in-progress and future research on student learning, as well as new directions in assessment and tooling as we scale up the course.

Mine Cetinkaya-Rundel

June 22, 2021

Transcript

  1. Image credit: Thomas Pedersen, data-imaginist.com/art the art and science of teaching data science mine çetinkaya-rundel bit.ly/ds-art-sci-nordstat mine-cetinkaya-rundel [email protected] @minebocek duke university & rstudio
  2. How can we effectively and efficiently teach data science to students with little to no background in computing and statistical thinking? How can we equip them with the skills and tools for reasoning with various types of data and leave them wanting to learn more?
  3. your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub
  4. your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data
  5. your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation
  6. your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation choose your own adventure: text analysis, Bayesian inference, interactive visualization and reporting + communication & dissemination
  7. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd
  8. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced
  9. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced ‣ Then, look for “France” in the code and replace it with another country. Knit again, and review how the voting patterns of the country you picked compare to those of the United States and United Kingdom
  10. three questions that keep me up at night… 1 what should students learn? 2 how will students learn best? 3 what tools will enhance student learning?
  11. three questions that keep me up at night… 1 what should students learn? 2 how will students learn best? 3 what tools will enhance student learning? content pedagogy infrastructure
  12. ✴ data joins ✴ data science ethics ✴ critique ✴ improving data visualisations ✴ mapping
  13. Project: 2016 US Election Redux Question: Would the outcome of the 2016 US Presidential Elections have been different had Bernie Sanders been the Democrat candidate? Team: 4 Squared
  14. ✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration
  15. ✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation
  16. ✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis
  17. ✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis ✴ data science ethics

    robotstxt::paths_allowed("https://www.gov.scot")
    #> www.gov.scot
    #> [1] TRUE
  18. Project: The North South Divide: University Edition Question: Does the geographical location of a UK university affect its university score? Team: Fried Egg Jelly Fish
  19. teams: weekly labs in teams + periodic team evaluations + term project in teams “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material
  20. # A tibble: 19 x 2

       bigram                  n
       <chr>               <int>
     1 question 7             19
     2 question 8             16
     3 questions 7            12
     4 join function           9
     5 question 2              9
     6 choice questions        7
     7 first question          7
     8 multiple choice         7
     9 correct answer          6
    10 necessarily improve     6
    11 join functions          5
    12 question 1              5
    13 7 8                     4
    14 airline names           4
    15 data frames             4
    16 feel like               4
    17 many options            4
    18 right answer            4
    19 x axis                  4
  21. teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material
  22. teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly)
  23. teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly) creativity: assignments that make room for creativity
  24. student-facing + 📦 ghclass + instructor-facing 📦 checklist + + 📦 learnr + 📦 parsermd 📦 gradethis 📦 learnrhash
  25. on

  26. Image credit: Thomas Pedersen, data-imaginist.com/art the art and science of teaching data science mine çetinkaya-rundel mine-cetinkaya-rundel [email protected] @minebocek bit.ly/ds-art-sci-nordstat duke university & rstudio