The art and science of teaching data science (Nordstat)

Modern statistics is fundamentally a computational discipline, but too often this fact is not reflected in our statistics curricula. With the rise of data science, it has become increasingly clear that students want, expect, and need explicit training in this area of the discipline. Additionally, recent curricular guidelines clearly state that working with data requires extensive computing skills and that statistics students should be fluent in accessing, manipulating, analyzing, and modeling with professional statistical analysis software. In this talk, we introduce the design philosophy behind an introductory data science course, and discuss in-progress and future research on student learning, as well as new directions in assessment and tooling as we scale up the course.

Mine Cetinkaya-Rundel

June 22, 2021

Transcript

  1. Image credit: Thomas Pedersen, data-imaginist.com/art the art and science of

    teaching data science mine çetinkaya-rundel bit.ly/ds-art-sci-nordstat mine-cetinkaya-rundel cetinkaya.mine@gmail.com @minebocek duke university & rstudio
  2. How can we effectively and efficiently teach data

    science to students with little to no background in computing and statistical thinking? How can we equip them with the skills and tools for reasoning with various types of data and leave them wanting to learn more?
  3. demonstrate concrete course examples share a few tips provide open-source

    teaching resources goals
  4. None
  5. your first data visualization + R / RStudio, R

    Markdown, simple Git
  6. your first data visualization + R / RStudio, R

    Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub
  7. your first data visualization + R / RStudio, R

    Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data
  8. your first data visualization + R / RStudio, R

    Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation
  9. your first data visualization + R / RStudio, R

    Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation choose your own adventure: text analysis, Bayesian inference, Interactive visualization and reporting + communication & dissemination
  10. None
  11. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the

    project titled UN Votes
  12. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the

    project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd
  13. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the

    project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced
  14. ‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the

    project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced ‣ Then, look for “France” in the code and replace it with another country ‣ Knit again, and review how the voting patterns of the country you picked compare to the United States and United Kingdom
  15. three questions that keep me up at night… 1 what

    should students learn? 2 how will students learn best? 3 what tools will enhance student learning?
  16. three questions that keep me up at night… 1 what

    should students learn? 2 how will students learn best? 3 what tools will enhance student learning? content pedagogy infrastructure
  17. content

  18. ex. 1 fisheries of the world

  19. None
  20. ✴ data joins

  21. ✴ data joins ✴ data science ethics

  22. ✴ data joins ✴ data science ethics ✴ critique ✴

    improving data visualisations
  23. ✴ data joins ✴ data science ethics ✴ critique ✴

    improving data visualisations ✴ mapping
  24. Project: 2016 US Election Redux Question: Would the outcome of

    the 2016 US Presidential Elections have been different had Bernie Sanders been the Democrat candidate? Team: 4 Squared
  25. ex. 2 First Minister’s COVID briefings

  26. None
  27. ✴ web scraping ✴ text parsing ✴ data types ✴

    regular expressions
  28. ✴ web scraping ✴ text parsing ✴ data types ✴

    regular expressions ✴ functions ✴ iteration
  29. ✴ web scraping ✴ text parsing ✴ data types ✴

    regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation
  30. ✴ web scraping ✴ text parsing ✴ data types ✴

    regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis
  31. ✴ web scraping ✴ text parsing ✴ data types ✴

    regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis ✴ data science ethics robotstxt::paths_allowed("https://www.gov.scot") #> www.gov.scot #> [1] TRUE
  32. Project: The North South Divide: University Edition Question: Does the

    geographical location of a UK university affect its university score? Team: Fried Egg Jelly Fish
  33. pedagogy

  34. teams: weekly labs in teams + periodic team evaluations +

    term project in teams
  35. teams: weekly labs in teams + periodic team evaluations +

    term project in teams “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material
  36. None
  37. # A tibble: 19 x 2

           bigram                  n
           <chr>               <int>
         1 question 7             19
         2 question 8             16
         3 questions 7            12
         4 join function           9
         5 question 2              9
         6 choice questions        7
         7 first question          7
         8 multiple choice         7
         9 correct answer          6
        10 necessarily improve     6
        11 join functions          5
        12 question 1              5
        13 7 8                     4
        14 airline names           4
        15 data frames             4
        16 feel like               4
        17 many options            4
        18 right answer            4
        19 x axis                  4
  38. teams: weekly labs in teams + periodic team evaluations +

    term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material
  39. None
  40. teams: weekly labs in teams + periodic team evaluations +

    term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly)
  41. None
  42. teams: weekly labs in teams + periodic team evaluations +

    term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly) creativity: assignments that make room for creativity
  43. None
  44. None
  45. infrastructure & tooling

  46. student-facing + 📦 ghclass + instructor-facing 📦 checklist + +

    📦 learnr + 📦 parsermd 📦 gradethis 📦 learnrhash
  47. 📦 ghclass + +

  48. openness

  49. datasciencebox.org

  50. rstudio-education.github.io/dsbox

  51. on

  52. introds.org

  53. rstd.io/design-ds-class

  54. Image credit: Thomas Pedersen, data-imaginist.com/art the art and science of

    teaching data science mine çetinkaya-rundel mine-cetinkaya-rundel cetinkaya.mine@gmail.com @minebocek bit.ly/ds-art-sci-nordstat duke university & rstudio
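
Slide 37 shows bigram counts taken from the weekly "minute paper" reflections. The deck does not include the code that produced that table, so what follows is only a minimal sketch of the general idea in base R, not the author's method. The `reflections` vector, the tiny `stop_words` list, and the helper names `tokenize` and `bigrams_of` are all hypothetical and exist purely for illustration; the real student responses are not part of the transcript.

```r
# Hypothetical reflections; the actual student responses are not in the deck.
reflections <- c(
  "The join function was confusing",
  "I was unsure about the join function",
  "Multiple choice questions were hard"
)

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
stop_words <- c("the", "was", "i", "about", "were")

# Split one response into lowercase word tokens, dropping empty strings.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  words[nzchar(words)]
}

# Pair each word with the word that follows it, then drop any pair
# in which either word is a stop word.
bigrams_of <- function(words) {
  if (length(words) < 2) return(character(0))
  first <- head(words, -1)
  second <- tail(words, -1)
  keep <- !(first %in% stop_words) & !(second %in% stop_words)
  paste(first[keep], second[keep])
}

all_bigrams <- unlist(lapply(reflections, function(x) bigrams_of(tokenize(x))))

# Tabulate the bigrams and sort by descending count, then alphabetically.
counts <- as.data.frame(table(bigram = all_bigrams),
                        responseName = "n", stringsAsFactors = FALSE)
counts <- counts[order(-counts$n, counts$bigram), ]
print(counts, row.names = FALSE)
```

With these made-up responses, "join function" comes out on top, which mirrors how a table like the one on slide 37 can point an instructor at the topics students mention most often in a given week.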