Lab
Stephanie Hicks
Assistant Professor, Biostatistics, Johns Hopkins Bloomberg School of Public Health
Faculty Member, Johns Hopkins Data Science Lab
@stephaniehicks
ABOUT ME
• R/Bioconductor user and developer (since 2009/2010)
Other fun things about me:
• Co-founded Baltimore
• Creating a children’s book featuring women statisticians and data scientists
developing solutions to practical problems by data analysis problems
• Galton, Ronald Fisher
• Wild and Pfannkuch (1999) describe applied statistics as: “part of the information gathering and learning process which, in an ideal world, is undertaken to inform decisions and actions. With industry, medicine and many other sectors of society increasingly relying on data for decision making, statistics should be an integral part of the emerging information era.”
• A department that embraces applied statistics as defined above is a natural home for data science in academia
Pfannkuch (1999) complained that: “Large parts of the investigative process, such as problem analysis and measurement, have been largely abandoned by statisticians and statistics educators to the realm of the particular, perhaps to be developed separately within other disciplines.”
They add that “[t]he arid, context-free landscape on which so many examples used in statistics teaching are built ensures that large numbers of students never even see, let alone engage in, statistical thinking.”
Creating
• Need more computing in the curriculum
• Need to teach how to connect the subject matter question to an appropriate dataset and analysis tools
• Instead of being passive, teach students to be active and how to create and formulate questions to investigate hypotheses with data
science courses
• Educators need to be experienced themselves in creating, connecting, and computing
• Encourage applied statisticians experienced in creating, connecting, and computing to become involved in the development of courses
• Encourage statistics departments to reach out to practicing data analysts, perhaps in other departments or from other disciplines, to collaborate in developing these courses
a set of diverse case studies
• Integrate computing into every aspect of the course
• Teach abstraction, but minimize reliance on mathematical notation
• Structure course activities to realistically mimic a data scientist’s experience
• Demonstrate the importance of critical thinking / skepticism through examples
[Figure, panels A to E: counts of student survey responses to "What is your age?", "What is your primary concentration?", "What is your primary programming language?", "Overall, how comfortable are you with programming?" (scale 1 to 5, less comfortable to more comfortable), and "How long have you been programming?" (<6mos, 6mos − 1yr, 1−3yrs, >3yrs)]