The {medicaldata} package at NHS-R 2021

The {medicaldata} package is an R package for teaching medical data analysis in R with patient-level data from clinical trials, cohort studies, and case-control studies.


Peter Higgins

November 09, 2021


  1. The {medicaldata} Teaching Package
    Peter D.R. Higgins
    University of Michigan, Ann Arbor, Michigan, USA
    @ibddoctor
    $1.62B/yr research grants
  2. Why a medical data package for teaching?
    • Learners find relevant examples motivating
    • These datasets illustrate data challenges they may face
    • There are a few medical datasets in R packages, but widely scattered
    • Many datasets are poorly documented, hard to understand
    • Quite bare-bones
    • It is really convenient to have datasets wrapped in one package
    • Easier for students and for instructors
    • Can re-use datasets across teaching concepts
    • {medicaldata} is focused on patient-level research data, which complements the system-level data in {NHSRdatasets}
  3. {medicaldata} • A data package with 15 (for now) medical

  4. GitHub • The code can be found on GitHub at:

  5. Contents
    • Two historical reconstructions of datasets
      • 1747 scurvy trial on the HMS Salisbury by James Lind
      • 1948 MRC streptomycin for tuberculosis trial
    • Five other RCTs
      • Sulindac for polyps, indomethacin for post-ERCP pancreatitis, others
    • Six Cohort & Case-Control studies
      • COVID testing, esophageal cancer case-control, CMV after BMT, others
    • Two pharmacokinetic studies
      • Indomethacin and Theophylline
  6. Documentation
    • Definitions & details on each variable, units, range, levels
    • Background on each study
    • Description of study design, intervention, measurements
    • Specification of study outcomes
    • Some suggestions for uses of each dataset
    • Full help(dataset) files
    • Linked codebooks and description documents on the pkgdown website & GitHub README
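As a minimal sketch of how the documentation described above can be reached from an R session, assuming {medicaldata} is already installed: the calls below are base R, and the dataset names strep_tb and scurvy are the ones mentioned elsewhere in this deck.

```r
library(medicaldata)

# List the names of all datasets shipped with the package.
# data(package = ...) returns a list whose `results` element is a
# character matrix with an "Item" column holding the dataset names.
dataset_names <- data(package = "medicaldata")$results[, "Item"]
print(dataset_names)

# Open the full help file (variables, units, study background) for one dataset.
help("strep_tb", package = "medicaldata")

# Quick structural overview of another dataset: variable names and types.
str(medicaldata::scurvy)
```

The help file is the quickest route to the variable definitions; the linked codebooks on the website go into more depth on each study.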
  7. Website • pkgdown website at https://higgi13425.github.io/medicaldata/

  8. Two Asks, One Plan, and One Give
    1. Add Examples
    2. Please Donate Datasets
    3. Plan – add Untidy datasets
  9. 1st Ask: Try it out – use {medicaldata} for teaching, add examples as issues
    • Used strep_tb dataset to teach table construction with {gtsummary}
      • Attach code (reprex)
    • Used scurvy dataset for categorical scatterplots of outcomes
      • Attach code (reprex)
    • Used the indo_rct dataset to make a covariate forest plot
      • Attach code (reprex)
    • Used theoph dataset for GAM modeling
      • Attach code (reprex)
    I would like to turn your examples into vignettes
    Avishai Tsur: https://avishaitsur.netlify.app/posts/2021-09-04-reproducing-the-results-of-an-rct/
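A teaching example of the first kind listed above (table construction with {gtsummary} on strep_tb) might look roughly like the sketch below. It assumes {medicaldata} and {gtsummary} are installed, and it assumes strep_tb has columns named arm, gender, baseline_condition, and improved; those column names are not given in this deck, so the code checks for them before using them.

```r
library(medicaldata)
library(gtsummary)

# Assumed column names (not stated in the slides); stop early if any is missing.
vars <- c("arm", "gender", "baseline_condition", "improved")
stopifnot(all(vars %in% names(strep_tb)))

# Summary table of the selected variables, split by treatment arm.
baseline_tbl <- tbl_summary(strep_tb[, vars], by = arm)

# Printing the object renders the table in the viewer or the console.
baseline_tbl
```

A snippet of this size, wrapped with reprex::reprex(), is about the right scale to attach to a GitHub issue.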
  10. 2nd Ask: Donate datasets
    • Do you have access to medical datasets?
      • Randomized controlled trials
      • Cohort studies
      • Case-control studies
    • Must be of reasonable size (5MB limit on CRAN)
    • Must be anonymized
      • Fake names, fake study IDs are helpful
    • Need a reasonable level of documentation/codebook/a publication
    I am adding several from Frank Harrell for the January 2022 release
  11. Future Plan: Add some untidy medical datasets
    • Wide medical data that need pivot_longer()
    • Untidy medical data that need help from {tidyr}
      • separate(), unite()
      • separate_rows()
      • nest(), unnest()
      • fill(), complete(), replace_na()
    • Color-coded medical data that need {tidyxl}
    • Multiheaded medical data that need {unheadr}
    • Messy medical data that need {unpivotr}
    • Feel free to donate some untidy messes/examples!
    Likely for the July 2022 release
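To illustrate the kind of "wide medical data that need pivot_longer()" mentioned above, here is a small sketch. The data frame crp_wide and its values are entirely hypothetical (made up for illustration, not taken from {medicaldata}); only the {tidyr} function calls are real.

```r
library(tidyr)

# Hypothetical wide data: one row per patient, one column per study visit.
crp_wide <- data.frame(
  patient_id = c("P01", "P02", "P03"),
  crp_week0  = c(12.1, 8.4, 20.3),
  crp_week4  = c(6.0, 7.9, 11.2),
  crp_week8  = c(3.2, NA, 5.5)
)

# Reshape to long format: one row per patient per visit.
crp_long <- pivot_longer(
  crp_wide,
  cols            = starts_with("crp_week"),
  names_to        = "week",
  names_prefix    = "crp_week",                 # strip the prefix, leaving "0", "4", "8"
  names_transform = list(week = as.integer),    # store the week as an integer
  values_to       = "crp_mg_l"
)

print(crp_long)
```

Missing visits stay in the long table as NA rows by default; setting values_drop_na = TRUE would drop them instead, which is itself a useful teaching point about implicit versus explicit missingness.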
  12. One Give
    • {medicaldata} hex stickers to the first 10 people who send a DM
    • Include your snail mail address
      Sender (your name and address)
      123 Data Street
      Medical Center
      City, State, Country, Postal code
    On Twitter @ibddoctor
    Important – must be in one of 180 allowed countries: https://bit.ly/3vtPWnf
    Roughly corresponds to FIFA membership
  13. CRAN Update
    • {medicaldata} is available on CRAN as of 16 August 2021
    • You can now install.packages("medicaldata")
    • Plan for updates about every 6 months (gradual changes to dev version)
    • Thanks for your feedback and GitHub issues!
  14. Thank You!
    Please ask questions, provide feedback, and discuss in the chat!