Slide 1

Slide 1 text

the art and science of teaching data science
mine çetinkaya-rundel
duke university & rstudio
bit.ly/ds-art-sci-nordstat
mine-cetinkaya-rundel [email protected] @minebocek
Image credit: Thomas Pedersen, data-imaginist.com/art

Slide 2

Slide 2 text

How can we effectively and efficiently teach data science to students with little to no background in computing and statistical thinking? How can we equip them with the skills and tools for reasoning with various types of data and leave them wanting to learn more?

Slide 3

Slide 3 text

goals: demonstrate concrete course examples, share a few tips, provide open-source teaching resources

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

your first data visualization + R / RStudio, R Markdown, simple Git

Slide 6

Slide 6 text

your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub

Slide 7

Slide 7 text

your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data

Slide 8

Slide 8 text

your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation

Slide 9

Slide 9 text

your first data visualization + R / RStudio, R Markdown, simple Git fundamentals of data & data viz, confounding variables, Simpson’s paradox, tidy data, recoding & transforming, web scraping & iteration + collaboration on GitHub ethical considerations around misrepresentation of data, relying on ML algorithms and the biases they might carry, privacy of one’s own data and reusing others’ data building & selecting models, visualising interactions, prediction & validation, inference via simulation choose your own adventure: text analysis, Bayesian inference, interactive visualization and reporting + communication & dissemination

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes

Slide 12

Slide 12 text

‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd

Slide 13

Slide 13 text

‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced

Slide 14

Slide 14 text

‣ Go to RStudio Cloud - bit.ly/dsbox-cloud ‣ Start the project titled UN Votes ‣ Open the R Markdown document called unvotes.Rmd ‣ Knit the document and review the data visualisation you just produced ‣ Then, look for “France” in the code and replace it with another country ‣ Knit again, and review how the voting patterns of the country you picked compare to those of the United States and United Kingdom

Slide 15

Slide 15 text

three questions that keep me up at night… 1 what should students learn? 2 how will students learn best? 3 what tools will enhance student learning?

Slide 16

Slide 16 text

three questions that keep me up at night… 1 what should students learn? (content) 2 how will students learn best? (pedagogy) 3 what tools will enhance student learning? (infrastructure)

Slide 17

Slide 17 text

content

Slide 18

Slide 18 text

ex. 1 fisheries of the world

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

✴ data joins

Slide 21

Slide 21 text

✴ data joins ✴ data science ethics

Slide 22

Slide 22 text

✴ data joins ✴ data science ethics ✴ critique ✴ improving data visualisations

Slide 23

Slide 23 text

✴ data joins ✴ data science ethics ✴ critique ✴ improving data visualisations ✴ mapping
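The "data joins" topic above can be sketched in a few lines of base R. This is a minimal illustration, not the course's own code: the `fisheries` and `continents` data frames, their column names, and every value in them are made up for the example (including a deliberately unmatched country).

```r
# Hypothetical data: neither the values nor the column names come from the course
fisheries <- data.frame(
  country = c("Norway", "Iceland", "Atlantis"),
  total   = c(100, 50, 1),
  stringsAsFactors = FALSE
)
continents <- data.frame(
  country   = c("Norway", "Iceland"),
  continent = c("Europe", "Europe"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of `fisheries`, attach `continent` where a match exists
joined <- merge(fisheries, continents, by = "country", all.x = TRUE)

# Countries with no match get NA, which is a useful prompt for data cleaning
unmatched <- joined$country[is.na(joined$continent)]
print(joined)
print(unmatched)
```

Keeping unmatched rows (rather than silently dropping them with an inner join) makes it easy to discuss why names fail to match across data sources.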

Slide 24

Slide 24 text

Project: 2016 US Election Redux Question: Would the outcome of the 2016 US Presidential Election have been different had Bernie Sanders been the Democratic candidate? Team: 4 Squared

Slide 25

Slide 25 text

ex. 2 First Minister’s COVID briefings

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions

Slide 28

Slide 28 text

✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration

Slide 29

Slide 29 text

✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation

Slide 30

Slide 30 text

✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis

Slide 31

Slide 31 text

✴ web scraping ✴ text parsing ✴ data types ✴ regular expressions ✴ functions ✴ iteration ✴ data visualisation ✴ interpretation ✴ text analysis ✴ data science ethics
robotstxt::paths_allowed("https://www.gov.scot")
#> www.gov.scot
#> [1] TRUE
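The text parsing, regular expressions, functions, and iteration topics listed above can be illustrated together with a small base R sketch. It assumes briefing titles end with a date such as "26 October 2020"; the titles below are hypothetical stand-ins rather than scraped data, and `extract_date()` is an illustrative helper, not a function from the course.

```r
# Hypothetical titles; in the exercise these would come from scraped pages
titles <- c(
  "Coronavirus (COVID-19) update: First Minister's speech 26 October 2020",
  "Coronavirus (COVID-19) update: First Minister's speech 3 November 2020",
  "A page title without a date"
)

# Extract a "day Month year" date from one title; returns NA if none is found
extract_date <- function(title) {
  pattern <- "[0-9]{1,2} [A-Z][a-z]+ [0-9]{4}"
  m <- regmatches(title, regexpr(pattern, title))
  if (length(m) == 0) return(as.Date(NA))
  parts <- strsplit(m, " ")[[1]]
  # Look the month up in base R's month.name to avoid locale-dependent parsing
  month <- match(parts[2], month.name)
  if (is.na(month)) return(as.Date(NA))
  as.Date(sprintf("%s-%02d-%02d", parts[3], month, as.integer(parts[1])))
}

# Iterate over all titles and combine the results into one Date vector
dates <- do.call(c, lapply(titles, extract_date))
print(dates)
```

Writing the parsing step as a function first, then iterating over it, mirrors the order of the topics on the slide.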

Slide 32

Slide 32 text

Project: The North South Divide: University Edition Question: Does the geographical location of a UK university affect its university score? Team: Fried Egg Jelly Fish

Slide 33

Slide 33 text

pedagogy

Slide 34

Slide 34 text

teams: weekly labs in teams + periodic team evaluations + term project in teams

Slide 35

Slide 35 text

teams: weekly labs in teams + periodic team evaluations + term project in teams “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

# A tibble: 19 x 2
   bigram                  n
 1 question 7             19
 2 question 8             16
 3 questions 7            12
 4 join function           9
 5 question 2              9
 6 choice questions        7
 7 first question          7
 8 multiple choice         7
 9 correct answer          6
10 necessarily improve     6
11 join functions          5
12 question 1              5
13 7 8                     4
14 airline names           4
15 data frames             4
16 feel like               4
17 many options            4
18 right answer            4
19 x axis                  4
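A bigram count like the one above can be sketched as follows. This is a minimal base R version so that it runs without extra packages; the tibble output on the slide suggests the actual analysis used tidyverse tooling, which is not reproduced here. The `reflections` vector is hypothetical, `count_bigrams()` is an illustrative helper, and no stop-word filtering is applied.

```r
# Hypothetical "minute paper" responses; not data from the course
reflections <- c(
  "The join function was confusing",
  "I liked using a join function",
  "Question 7 was hard"
)

# Count adjacent word pairs (bigrams) within each response
count_bigrams <- function(text) {
  # Lower-case, then split on anything that is not a letter or digit
  words_by_doc <- strsplit(tolower(text), "[^a-z0-9]+")
  bigrams <- unlist(lapply(words_by_doc, function(w) {
    w <- w[nzchar(w)]
    if (length(w) < 2) return(character(0))
    paste(head(w, -1), tail(w, -1))  # pair each word with its successor
  }))
  counts <- sort(table(bigrams), decreasing = TRUE)
  data.frame(
    bigram = names(counts),
    n = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

bigram_counts <- count_bigrams(reflections)
print(head(bigram_counts))
```

Bigrams are built per response, so a pair never spans two students' answers.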

Slide 38

Slide 38 text

teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly)

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

teams: weekly labs in teams + periodic team evaluations + term project in teams peer feedback: on projects “minute paper”: weekly online quizzes ending with a brief reflection of the week’s material web native (aka COVID friendly) creativity: assignments that make room for creativity

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

infrastructure & tooling

Slide 46

Slide 46 text

student-facing + 📦 ghclass + instructor-facing 📦 checklist + + 📦 learnr + 📦 parsermd 📦 gradethis 📦 learnrhash

Slide 47

Slide 47 text

📦 ghclass + +

Slide 48

Slide 48 text

openness

Slide 49

Slide 49 text

datasciencebox.org

Slide 50

Slide 50 text

rstudio-education.github.io/dsbox

Slide 51

Slide 51 text

on

Slide 52

Slide 52 text

introds.org

Slide 53

Slide 53 text

rstd.io/design-ds-class

Slide 54

Slide 54 text

the art and science of teaching data science
mine çetinkaya-rundel
duke university & rstudio
bit.ly/ds-art-sci-nordstat
mine-cetinkaya-rundel [email protected] @minebocek
Image credit: Thomas Pedersen, data-imaginist.com/art