Slide 1

Slide 1 text

The {medicaldata} Teaching Package
Peter D.R. Higgins
University of Michigan, Ann Arbor, Michigan, USA
@ibddoctor
$1.62B/yr research grants

Slide 2

Slide 2 text

Why a medical data package for teaching?
• Learners find relevant examples motivating
• These datasets illustrate data challenges they may face
• There are a few medical datasets in R packages, but widely scattered
  • Many datasets are poorly documented, hard to understand
  • Quite bare-bones
• It is really convenient to have datasets wrapped in one package
  • Easier for students and for instructors
  • Can re-use datasets across teaching concepts
• {medicaldata} is focused on patient-level research data, which complements the system-level data in {NHSRdatasets}

Slide 3

Slide 3 text

{medicaldata} • A data package with 15 (for now) medical datasets

Slide 4

Slide 4 text

GitHub • The code can be found on GitHub at: https://github.com/higgi13425/medicaldata

Slide 5

Slide 5 text

Contents
• Two historical reconstructions of datasets
  • 1747 scurvy trial on the HMS Salisbury by James Lind
  • 1948 MRC streptomycin for tuberculosis trial
• Five other RCTs
  • Sulindac for polyps, indomethacin for post-ERCP pancreatitis, others
• Six cohort & case-control studies
  • COVID testing, esophageal cancer case-control, CMV after BMT, others
• Two pharmacokinetic studies
  • Indomethacin and theophylline

Slide 6

Slide 6 text

Documentation
• Definitions & details on each variable, units, range, levels
• Background on each study
• Description of study design, intervention, measurements
• Specification of study outcomes
• Some suggestions for uses of each dataset
• Full help(dataset) files
• Linked codebooks and description documents on the pkgdown website & GitHub README
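As a minimal sketch of reaching that documentation from an R session (assuming {medicaldata} is already installed, and using the strep_tb dataset mentioned elsewhere in these slides as the example):

```r
library(medicaldata)

# Open the full help file for one dataset (equivalent to ?strep_tb)
help(strep_tb)

# Quick look at the variables and their types before reading the codebook
str(strep_tb)
```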

Slide 7

Slide 7 text

Website • pkgdown website at https://higgi13425.github.io/medicaldata/

Slide 8

Slide 8 text

Two Asks, One Plan, and One Give
1. Ask: add examples
2. Ask: please donate datasets
3. Plan: add untidy datasets
4. Give: {medicaldata} hex stickers

Slide 9

Slide 9 text

1st Ask: Try it out – use {medicaldata} for teaching, add examples as issues
• Used strep_tb dataset to teach table construction with {gtsummary}
  • Attach code (reprex)
• Used scurvy dataset for categorical scatterplots of outcomes
  • Attach code (reprex)
• Used the indo_rct dataset to make a covariate forest plot
  • Attach code (reprex)
• Used theoph dataset for GAM modeling
  • Attach code (reprex)
I would like to turn your examples into vignettes
Avishai Tsur: https://avishaitsur.netlify.app/posts/2021-09-04-reproducing-the-results-of-an-rct/
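A submitted example might look roughly like the sketch below, which builds a summary table from strep_tb with {gtsummary}. The column names used here (arm, gender, baseline_condition, improved) are assumed from the dataset's help file, so check help(strep_tb) before relying on them.

```r
library(medicaldata)
library(gtsummary)

# Keep a handful of columns; names are assumed from help(strep_tb)
tbl_data <- strep_tb[, c("arm", "gender", "baseline_condition", "improved")]

# One column per study arm, one row block per variable
strep_table <- tbl_summary(tbl_data, by = arm)
strep_table
```

Wrapping code like this with reprex::reprex() before pasting it into a GitHub issue keeps the example reproducible.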

Slide 10

Slide 10 text

2nd Ask: Donate datasets
• Do you have access to medical datasets?
  • Randomized controlled trials
  • Cohort studies
  • Case-control studies
• Must be of reasonable size (5MB limit on CRAN)
• Must be anonymized
  • Fake names, fake study IDs are helpful
• Need a reasonable level of documentation/codebook/a publication
I am adding several from Frank Harrell for the January 2022 release

Slide 11

Slide 11 text

Future Plan: Add some untidy medical datasets
• Wide medical data that need pivot_longer()
• Untidy medical data that need help from {tidyr}
  • separate(), unite()
  • separate_rows()
  • nest(), unnest()
  • fill(), complete(), replace_na()
• Color-coded medical data that need {tidyxl}
• Multiheaded medical data that need {unheadr}
• Messy medical data that need {unpivotr}
• Feel free to donate some untidy messes/examples!
Likely for the July 2022 release
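To illustrate the kind of exercise a wide dataset would support, here is a small sketch with pivot_longer(). The data frame crp_wide and its values are made up for illustration and are not part of {medicaldata}.

```r
library(tidyr)

# Hypothetical wide lab data: one row per patient, one column per visit.
# Values are invented for this sketch, not taken from {medicaldata}.
crp_wide <- data.frame(
  patient_id = c("P01", "P02", "P03"),
  crp_week0  = c(12.1, 30.4, 8.7),
  crp_week4  = c(6.3, 18.9, NA),
  crp_week8  = c(3.2, 9.5, 4.1)
)

# Reshape to one row per patient per visit
crp_long <- pivot_longer(
  crp_wide,
  cols = starts_with("crp_week"),
  names_to = "week",
  names_prefix = "crp_week",                  # strip the prefix, leaving 0, 4, 8
  names_transform = list(week = as.integer),  # store the week as an integer
  values_to = "crp_mg_l"
)

crp_long
```

The long form has one measurement per row, which is the shape most plotting and modeling functions expect.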

Slide 12

Slide 12 text

One Give
• {medicaldata} hex stickers to the first 10 people who send a DM
• Include your snail mail address
Sender (your name and address)
123 Data Street
Medical Center
City, State, Country, Postal code
On Twitter @ibddoctor
Important – must be in one of 180 allowed countries: https://bit.ly/3vtPWnf
Roughly corresponds to FIFA membership

Slide 13

Slide 13 text

CRAN Update
• {medicaldata} is available on CRAN as of 16 August 2021
• You can now install.packages("medicaldata")
• Plan for updates roughly every 6 months (gradual changes to dev version)
• Thanks for your feedback and GitHub issues!
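A minimal sketch of getting started, assuming a working R installation with internet access. Installing the development version through the {remotes} package is one common route and is an assumption here, not something the slides prescribe.

```r
# Released version from CRAN (one-time install)
install.packages("medicaldata")

# Optional: development version from the GitHub repository.
# Assumes the {remotes} package is installed.
# remotes::install_github("higgi13425/medicaldata")

# Load the package and list the datasets it ships with
library(medicaldata)
data(package = "medicaldata")
```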

Slide 14

Slide 14 text

Thank You! Please ask questions, provide feedback, and discuss in the chat!