Science for Social Good Fellowship 2018 | dssgfellowship.org
Partner
A non-profit national network of more than 40 community health centers serving the least-resourced members of their communities
How well do the screening guidelines do?
Guideline criteria: 40-70 years old, BMI ≥ 25
Patient: ✓ ✓ Meets the criteria of the guidelines
~50%
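The guideline check on this slide reduces to two thresholds. A minimal sketch, assuming age in years and BMI in kg/m² are available as plain numbers and that both age bounds are inclusive; the function name `meets_screening_guidelines` is hypothetical and not taken from the project.

```python
def meets_screening_guidelines(age_years: float, bmi: float) -> bool:
    """Return True when a patient meets both criteria from the slide:
    40-70 years old (bounds assumed inclusive) and BMI >= 25."""
    return 40 <= age_years <= 70 and bmi >= 25


# Illustrative patients (made-up values, not project data).
print(meets_screening_guidelines(55, 27.3))  # meets both criteria
print(meets_screening_guidelines(35, 31.0))  # outside the age range
```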
De-identified EHR: Demographics, Visits, Medications, Lab results, Diagnoses
• 1.1 million patients
• 24 health centers
• ~8 million visits
External: American Community Survey (ACS) data
Label: Develop diabetes within the next 3 years
Data start date: 01/03/2006
Data end date: 06/15/2018
Visit date: 01/01/2014
Diagnosis date: 01/01/2016
Window end date: 12/31/2017
Label = True
Features
- Age at visit
- BMI at visit
- Sex
- Race
- Family history of T2D
- De-identified HC location
- Smoking status
- Blood pressure (SBP, DBP, categorical)
- Hospitalization in prior visit
- Diagnosis of comorbidities (e.g. sleep apnea)
- Number of meds prescribed in the past 6 months, 1 year
- Avg. BMI in the past 6 months, 1 year
- Gini Index
- Median Household Income by zip code
Feature sources: Raw from Alliance, External (ACS), Computed Aggregates
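The computed aggregates in this list (average BMI and number of prescriptions over the past 6 months or 1 year) are look-back windows ending at the visit. A minimal sketch, assuming measurements arrive as (date, value) pairs, that "6 months" is approximated as 182 days, and that the visit day itself is excluded from the window; the function names and sample values are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean


def avg_bmi_in_window(measurements, visit_date, days):
    """Average BMI over measurements taken in the `days` before the visit.

    `measurements` is a list of (date, bmi) pairs. Returns None when the
    window holds no measurement, leaving the gap for later imputation.
    """
    start = visit_date - timedelta(days=days)
    values = [bmi for day, bmi in measurements if start <= day < visit_date]
    return mean(values) if values else None


def count_meds_in_window(prescription_dates, visit_date, days):
    """Count prescriptions written in the `days` before the visit."""
    start = visit_date - timedelta(days=days)
    return sum(1 for day in prescription_dates if start <= day < visit_date)


# Made-up history for one patient, for illustration only.
visit = date(2014, 1, 1)
bmi_history = [(date(2013, 3, 1), 31.0), (date(2013, 9, 15), 30.0), (date(2013, 12, 1), 29.0)]
prescriptions = [date(2013, 2, 1), date(2013, 8, 1), date(2013, 11, 20)]

bmi_6m = avg_bmi_in_window(bmi_history, visit, days=182)
bmi_1y = avg_bmi_in_window(bmi_history, visit, days=365)
meds_6m = count_meds_in_window(prescriptions, visit, days=182)
meds_1y = count_meds_in_window(prescriptions, visit, days=365)
print(bmi_6m, bmi_1y, meds_6m, meds_1y)
```

Returning None for an empty window keeps missing history distinct from a real value, which matches the later "Impute" step in the pipeline.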
Technical Solution
Raw data and external data -> Cleaned data -> Staging, features and labels -> Train / Test -> Trained Model -> Predictions -> Metrics -> Store predictions, metrics, model ID
Technical Solution
Staged data -> Train / Test -> Trained Model -> Predictions -> Metrics -> Store predictions, metrics, model ID
- Create a table of cross-validation time splits
- Split data into train/test set (for one time split)
- Impute, generate more features
- Train the model
- Generate predictions on the test set
- Calculate metrics (precision & recall @k)
- Store predictions and results
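The first step above, building a table of time splits, can be sketched as follows. This is an illustration under stated assumptions, not the project's actual splitting scheme: test periods are taken to be one calendar year long, training visits stop one label window (3 years) before the test period starts so that every training label is observable by then, and the function name `make_time_splits` and its parameters are hypothetical.

```python
from datetime import date


def make_time_splits(data_start, data_end, label_years=3, n_splits=3):
    """Build a table (list of dicts) of temporal train/test splits.

    Each test period is one calendar year whose label windows end no
    later than January 1 of the year in which the data end. Training
    visits stop `label_years` before the test period begins.
    All *_end values are exclusive upper bounds on the visit date.
    """
    splits = []
    for i in range(n_splits):
        test_end_year = data_end.year - label_years - i
        test_start = date(test_end_year - 1, 1, 1)
        train_end = date(test_start.year - label_years, 1, 1)
        if train_end <= data_start:
            break  # not enough history left to train on
        splits.append({
            "train_start": data_start,
            "train_end": train_end,
            "test_start": test_start,
            "test_end": date(test_end_year, 1, 1),
        })
    return splits


# Data range taken from the label slides.
splits = make_time_splits(date(2006, 1, 3), date(2018, 6, 15))
for row in splits:
    print(row)
```

Splitting by time rather than at random keeps every training visit earlier than the test visits, so the evaluation mimics predicting the future from the past.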
How well does our model do?
Random Forest, 10000 estimators, maximum depth 10
[Chart values: 63%, 53%, 74%, 15, 25]
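The pipeline evaluates models with precision and recall at k. A minimal sketch of those two metrics, assuming k is an absolute number of top-ranked patients (the slides do not say whether k is a count or a percentage); the function name and the sample scores are hypothetical.

```python
def precision_recall_at_k(scores, labels, k):
    """Precision and recall among the k highest-scoring patients.

    scores: predicted risk per patient (higher means riskier)
    labels: True/False outcomes aligned with `scores`
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of patients")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = sum(1 for _, label in ranked[:k] if label)
    total_positives = sum(1 for label in labels if label)
    precision = true_positives / k
    recall = true_positives / total_positives if total_positives else 0.0
    return precision, recall


# Made-up scores and outcomes for six patients.
scores = [0.9, 0.8, 0.7, 0.4, 0.2, 0.1]
labels = [True, False, True, False, True, False]
precision_at_3, recall_at_3 = precision_recall_at_k(scores, labels, k=3)
print(precision_at_3, recall_at_3)
```

Precision at k answers "of the k patients flagged, how many went on to develop diabetes", while recall at k answers "of all patients who developed diabetes, how many were among the k flagged".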
Predict risk of Type II Diabetes
Personalize Screening Decisions
Connect to interventions and services
Prevent diabetes and improve health
Thank you!
Benjamin Ackerman, Kaleigh Clary, Jorge Saldivar, William Wang, Katy Dupre, Adolfo De Unánue, Rayid Ghani
Label: Develop diabetes within the next 3 years
Data start date: 01/03/2006
Data end date: 06/15/2018
Visit date: 01/01/2014
Label = False
Label: Develop diabetes within the next 3 years
Data start date: 01/03/2006
Visit date: 01/01/2014
Window end date: 12/31/2017
Diagnosis date: 02/01/2018 (after the window end)
Label = False
Label: Develop diabetes within the next 3 years
Data start date: 01/03/2006
Data end date: 06/15/2018
Visit date: 01/01/2014
Diagnosis date: 01/01/2013 (before the visit)
Label = NULL
Label: Develop diabetes within the next 3 years
Data start date: 01/03/2006
Data end date: 06/15/2018
Visit date: 01/01/2016
Diagnosis date? 01/01/2019 (the 3-year window runs past the data end date)
Label = NULL
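The labeling cases shown on the slides (True, False, and the two NULL cases) can be combined into one rule. A minimal sketch, assuming the window end date is supplied by the caller as on the slides, that a diagnosis on the visit date counts as already diagnosed, and that a diagnosis observed inside a window that runs past the data end still counts as True; the function name `assign_label` is hypothetical, and None stands in for NULL.

```python
from datetime import date


def assign_label(visit_date, window_end, data_end, diagnosis_date=None):
    """Return True, False, or None (NULL) for one patient visit."""
    # Already diagnosed at or before the visit: nothing left to predict.
    if diagnosis_date is not None and diagnosis_date <= visit_date:
        return None
    # Diagnosed after the visit and inside the window: positive label.
    if diagnosis_date is not None and diagnosis_date <= window_end:
        return True
    # No diagnosis seen in the window, but the window extends past the
    # end of the data, so the outcome cannot be known yet.
    if window_end > data_end:
        return None
    # The full window was observed without a diagnosis inside it.
    return False


data_end = date(2018, 6, 15)
visit = date(2014, 1, 1)
window_end = date(2017, 12, 31)

label_diagnosed_in_window = assign_label(visit, window_end, data_end, date(2016, 1, 1))
label_never_diagnosed = assign_label(visit, window_end, data_end)
label_diagnosed_after_window = assign_label(visit, window_end, data_end, date(2018, 2, 1))
label_diagnosed_before_visit = assign_label(visit, window_end, data_end, date(2013, 1, 1))
# For the 2016 visit the slide gives no window end; any date after the
# data end behaves the same, so 2019-12-31 is an assumed value.
label_window_past_data = assign_label(date(2016, 1, 1), date(2019, 12, 31), data_end)

print(label_diagnosed_in_window, label_never_diagnosed, label_diagnosed_after_window,
      label_diagnosed_before_visit, label_window_past_data)
```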
Raw data (EHR) and external data (ACS, ICD) -> Cleaned data -> Staging, features and labels -> Train / Test -> Trained Model -> Predictions -> Metrics -> Store predictions, metrics, model ID
Configs: Config File (.yaml)
of people worldwide [2]
Associated medical conditions [4]: Heart disease, Kidney failure, Blindness, Stroke
14% of adults in the US have the disease [3] (45 million)
[2] Whiting DR, Guariguata L, Weil C, Shaw J. IDF diabetes atlas: global estimates of the prevalence of diabetes for 2011 and 2030.
[3] Menke A, Casagrande S, Geiss L, Cowie CC. Prevalence of and Trends in Diabetes Among Adults in the United States, 1988–2012.
[4] Centers for Disease Control and Prevention. National Diabetes Statistics Report: Estimates of Diabetes and Its Burden in the United States, 2014. Atlanta, GA: US Department of Health and Human Services, Centers for Disease Control and Prevention, 2014.