Slide 1

Slide 1 text

Machine Learning for Social Good
Dr. Jorge Saldivar
Barcelona Supercomputing Center (BSC)
DataBeers BCN, 17-12-2019
Image source: https://sites.google.com/site/icml2016data4goodworkshop/

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Supporting Proactive Diabetes Screenings to Improve Health Outcomes Team: Benjamin Ackerman - Kaleigh Clary - Jorge Saldivar - William Wang - Katy Dupre - Adolfo De Unánue - Rayid Ghani

Slide 4

Slide 4 text

Supporting Proactive Diabetes Screenings to Improve Health Outcomes | Data Science for Social Good Fellowship 2018 | dssgfellowship.org
Partner
A non-profit national network of more than 40 community health centers serving the least resourced members of their communities

Slide 5

Slide 5 text

- 23 years old
- Obese (BMI 30)
- Family history of diabetes
- Hypertension

Slide 6

Slide 6 text

Federal screening guidelines
- 40-70 years old
- BMI ≥ 25
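The guideline criteria on this slide amount to a simple rule. A minimal sketch in Python, assuming the age range is inclusive at both ends (the slide does not say); the function name is hypothetical:

```python
def meets_screening_guidelines(age_years: float, bmi: float) -> bool:
    """Return True when a patient satisfies both guideline criteria."""
    # Assumption: the 40-70 age range includes both endpoints.
    return 40 <= age_years <= 70 and bmi >= 25


# The patient described on the previous slide: 23 years old, BMI 30.
young_patient_flagged = meets_screening_guidelines(age_years=23, bmi=30)
```

Under this rule the 23-year-old patient is not flagged, despite the obesity, family history, and hypertension listed for them.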

Slide 7

Slide 7 text

How well do the screening guidelines do?
- 40-70 years old ✓
- BMI ≥ 25 ✓
Meet the criteria of the guidelines
Patient ~50%

Slide 8

Slide 8 text

Our Goal
- 40-70 years old ✓
- BMI ≥ 25 ✓

Slide 9

Slide 9 text

Data
De-identified EHR
● 1.1 million patients
● 24 health centers
● ~8 million visits
Demographics, Visits, Medications, Lab results, Diagnoses
External: American Community Survey (ACS)

Slide 10

Slide 10 text

Defining Type II Diabetes Cases
82,960 cases (7.2% of patient population)
- ICD diagnoses
- 2 A1C tests > 6.4
- Metformin Rx
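The slide lists three criteria but does not say how they are combined. The sketch below assumes that meeting any one of them is enough to count as a case; that choice, the function name, and the argument names are all assumptions made for illustration:

```python
from typing import Sequence

A1C_THRESHOLD = 6.4  # from the slide: A1C tests > 6.4


def is_t2d_case(
    has_icd_diagnosis: bool,
    a1c_results: Sequence[float],
    has_metformin_rx: bool,
) -> bool:
    """Flag a patient as a Type II diabetes case.

    Assumption: any single criterion is sufficient (logical OR).
    """
    # Count A1C results strictly above the threshold; two are required.
    elevated_tests = sum(1 for value in a1c_results if value > A1C_THRESHOLD)
    return has_icd_diagnosis or elevated_tests >= 2 or has_metformin_rx
```

If the team actually required several criteria at once, the `or` operators would become `and`.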

Slide 11

Slide 11 text

Prediction Time - Patient visit
Patient Id | Visit Date | Feature 1 | Feature 2 | ... | Feature N
1 | 2015-01-01 | ... | ... | ... | ...
2 | 2014-12-11 | ... | ... | ... | ...
3 | 2013-05-10 | ... | ... | ... | ...
2 | 2012-06-05 | ... | ... | ... | ...
3 | 2011-07-05 | ... | ... | ... | ...

Slide 12

Slide 12 text

Label: Develop diabetes within the next 3 years
- Data start date: 01/03/2006
- Visit date: 01/01/2014
- Diagnosis date: 01/01/2016
- Window end date: 12/31/2017
- Data end date: 06/15/2018
Label = True
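The labeling rule, together with the False and NULL cases shown in the appendix slides, can be sketched as a small function. This is an illustrative reading of the slides, not the team's code. The window end date is taken as an input because the slides give it directly. Treating every visit whose window runs past the data end date as NULL, even if a diagnosis was already recorded, is an assumption:

```python
from datetime import date
from typing import Optional

DATA_END = date(2018, 6, 15)  # data end date from the slides


def make_label(
    visit: date,
    diagnosis: Optional[date],
    window_end: date,
    data_end: date = DATA_END,
) -> Optional[bool]:
    """Return True, False, or None (NULL) for one patient visit."""
    if diagnosis is not None and diagnosis < visit:
        return None  # diagnosed before the visit: nothing left to predict
    if window_end > data_end:
        return None  # assumption: window not fully observed, outcome unknown
    if diagnosis is not None and diagnosis <= window_end:
        return True  # diagnosed inside the label window
    return False  # no diagnosis, or diagnosis after the window closed


# The example on this slide: visit 01/01/2014, diagnosis 01/01/2016.
example_label = make_label(date(2014, 1, 1), date(2016, 1, 1), date(2017, 12, 31))
```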

Slide 13

Slide 13 text

Features
- Age at visit
- BMI at visit
- Sex
- Race
- Family history of T2D
- De-identified HC location
- Smoking status
- Blood Pressure (SBP, DBP, Categorical)
- Hospitalization in prior visit
- Diagnosis of comorbidities (e.g. sleep apnea)
- Number of meds prescribed in the past 6 months, 1 year
- Avg. BMI in the past 6 months, 1 year
- Gini Index
- Median Household Income by zip code
Feature sources: Raw from Alliance; External (ACS); Computed Aggregates
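Aggregate features such as "Avg. BMI in the past 6 months, 1 year" are computed over a trailing window that ends at the visit date. A minimal sketch follows. The function name, the 182-day approximation of six months, the inclusion of the visit date itself, and the sample measurements are all hypothetical:

```python
from datetime import date, timedelta
from statistics import mean
from typing import List, Optional, Tuple


def trailing_mean(
    measurements: List[Tuple[date, float]],
    as_of: date,
    days: int,
) -> Optional[float]:
    """Average the values measured in the `days` before (and on) `as_of`."""
    window_start = as_of - timedelta(days=days)
    values = [v for measured_on, v in measurements if window_start <= measured_on <= as_of]
    # Return None when the patient has no measurement in the window,
    # leaving the gap for a later imputation step.
    return mean(values) if values else None


# Hypothetical BMI history for one patient.
bmi_history = [
    (date(2013, 1, 10), 28.0),
    (date(2013, 9, 1), 30.0),
    (date(2013, 12, 1), 32.0),
]
visit_date = date(2014, 1, 1)
avg_bmi_6_months = trailing_mean(bmi_history, visit_date, days=182)
avg_bmi_1_year = trailing_mean(bmi_history, visit_date, days=365)
```

Using only measurements dated on or before the visit keeps information from after the prediction time out of the features.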

Slide 14

Slide 14 text

Technical Solution
[Pipeline diagram: Raw data, External data; Cleaned data; Staging, features and labels; Train, Test; Trained Model; Predictions; Metrics; Store predictions, metrics, model ID]

Slide 15

Slide 15 text

Technical Solution
[Pipeline diagram: Staged data; Train, Test; Trained Model; Predictions; Metrics; Store predictions, metrics, model ID]
- Create a table of cross-validation time splits
- Split data into train/test set (for one time split)
- Impute, generate more features
- Train the model
- Generate predictions on the test set
- Calculate metrics (precision & recall @k)
- Store predictions and results
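Precision and recall at k are computed by ranking the visits by predicted score and treating the top k fraction as flagged for screening. A minimal sketch, with a hypothetical function name and hypothetical toy labels and scores:

```python
from typing import Sequence, Tuple


def precision_recall_at_k(
    labels: Sequence[int],
    scores: Sequence[float],
    k_fraction: float,
) -> Tuple[float, float]:
    """Precision and recall when the top `k_fraction` of scores is flagged."""
    # Rank visits from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_flagged = max(1, int(len(ranked) * k_fraction))
    true_positives = sum(label for _, label in ranked[:n_flagged])
    total_positives = sum(labels)
    precision = true_positives / n_flagged
    recall = true_positives / total_positives if total_positives else 0.0
    return precision, recall


# Toy data: ten visits, three of which are true cases.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.95, 0.90, 0.80, 0.75, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20]
precision_20, recall_20 = precision_recall_at_k(toy_labels, toy_scores, 0.2)
```

Recall at k answers the question the deck asks of both the guidelines and the model: of all the patients who go on to develop diabetes, what share falls inside the flagged group?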

Slide 16

Slide 16 text

How well do the guidelines do?
53%

Slide 17

Slide 17 text

How well does our model do?
63%
Random Forest, 10000 estimators, maximum depth 10

Slide 18

Slide 18 text

How well does our model do?
63% 53% 15
Random Forest, 10000 estimators, maximum depth 10

Slide 19

Slide 19 text

How well does our model do?
63% 53% 74% 15 25
Random Forest, 10000 estimators, maximum depth 10

Slide 20

Slide 20 text

- Predict risk of Type II Diabetes
- Personalize Screening Decisions
- Connect to interventions and services
- Prevent diabetes and improve health

Slide 21

Slide 21 text

Thank you!
Benjamin Ackerman - Kaleigh Clary - Jorge Saldivar - William Wang - Katy Dupre - Adolfo De Unánue - Rayid Ghani

Slide 22

Slide 22 text

Appendix

Slide 23

Slide 23 text

Label: Develop diabetes within the next 3 years
- Start date: 01/03/2006
- End date: 06/15/2018
- Visit date: 01/01/2014
Label = False

Slide 24

Slide 24 text

Label: Develop diabetes within the next 3 years
- Data start date: 01/03/2006
- Visit date: 01/01/2014
- Window end date: 12/31/2017
- Diagnosis date: 02/01/2018
Label = False

Slide 25

Slide 25 text

Label: Develop diabetes within the next 3 years
- Start date: 01/03/2006
- End date: 06/15/2018
- Diagnosis date: 01/01/2013
- Visit date: 01/01/2014
Label = NULL

Slide 26

Slide 26 text

Label: Develop diabetes within the next 3 years
- Start date: 01/03/2006
- Visit date: 01/01/2016
- Data end date: 06/15/2018
- Diagnosis date? 01/01/2019 (3 years)
Label = NULL

Slide 27

Slide 27 text

Total visits
USPSTF
- visits screening recommendation (20%)
- T2D cases detected (53%)
DSSG
- visits screening recommendation (20%)
- T2D cases detected (63%)
Label | Prediction Score
1 | 0.93
1 | 0.87
0 | 0.81
1 | 0.79
0 | 0.77
... | ...
0 | 0.21
0 | 0.15
0 | 0.09
k% most probable

Slide 28

Slide 28 text

Train/Test Splits
Train row span | Timestamp of first row | Timestamp of last row | Train label span | Timestamp of first label | Timestamp of last label | Test row span | Test label span
2006-01-01 - 2006-12-31 | 2006-01-01 | 2006-12-31 | 2006-01-01 - 2009-12-31 | 2006-01-01 | 2009-12-31 | 2010-01-01 - 2010-12-31 | 2010-01-01 - 2013-12-31
2007-01-01 - 2007-12-31 | 2007-01-01 | 2007-12-31 | 2007-01-01 - 2010-12-31 | 2007-01-01 | 2010-12-31 | 2011-01-01 - 2011-12-31 | 2011-01-01 - 2014-12-31
2008-01-01 - 2008-12-31 | 2008-01-01 | 2008-12-31 | 2008-01-01 - 2011-12-31 | 2008-01-01 | 2011-12-31 | 2012-01-01 - 2012-12-31 | 2012-01-01 - 2015-12-31
2009-01-01 - 2009-12-31 | 2009-01-01 | 2009-12-31 | 2009-01-01 - 2012-12-31 | 2009-01-01 | 2012-12-31 | 2013-01-01 - 2013-12-31 | 2013-01-01 - 2016-12-31
2010-01-01 - 2010-12-31 | 2010-01-01 | 2010-12-31 | 2010-01-01 - 2013-12-31 | 2010-01-01 | 2013-12-31 | 2014-01-01 - 2014-12-31 | 2014-01-01 - 2017-12-31
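The split table above follows a regular pattern: each training set covers one calendar year of visits, its labels extend three further years, and the test year starts once the training labels are complete. A sketch that regenerates the same spans, with a hypothetical function name and dictionary keys:

```python
from datetime import date
from typing import Dict, List, Tuple

Span = Tuple[date, date]


def make_time_splits(
    first_train_year: int,
    n_splits: int,
    label_years: int = 3,
) -> List[Dict[str, Span]]:
    """Build yearly train/test row spans and their label spans."""
    splits = []
    for offset in range(n_splits):
        train_year = first_train_year + offset
        # The test year begins only after the training label window closes,
        # so no test-period outcome leaks into training.
        test_year = train_year + label_years + 1
        splits.append({
            "train_rows": (date(train_year, 1, 1), date(train_year, 12, 31)),
            "train_labels": (date(train_year, 1, 1), date(train_year + label_years, 12, 31)),
            "test_rows": (date(test_year, 1, 1), date(test_year, 12, 31)),
            "test_labels": (date(test_year, 1, 1), date(test_year + label_years, 12, 31)),
        })
    return splits


splits = make_time_splits(first_train_year=2006, n_splits=5)
```

With these arguments the last split tests on 2014 visits with labels running to 2017-12-31, which is the latest label window that ends before the data end date of 06/15/2018.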

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

Time from Initial Visit to T2D Diagnosis
42% of T2D cases!

Slide 31

Slide 31 text

[Pipeline diagram: Raw data (EHR), External data (ACS, ICD); Cleaned data; Staging, features and labels; Train, Test; Trained Model; Predictions; Metrics; Store predictions, metrics, model ID]
Configs: Config File (.yaml)

Slide 32

Slide 32 text

Type 2 diabetes, a serious public health problem
- 410 million people worldwide [2]
- 14% of adults in the US have the disease (45 million) [3]
- Associated medical conditions [4]: heart disease, kidney failure, blindness, stroke
[2] Whiting DR, Guariguata L, Weil C, Shaw J. IDF diabetes atlas: global estimates of the prevalence of diabetes for 2011 and 2030.
[3] Menke A, Casagrande S, Geiss L, Cowie CC. Prevalence of and Trends in Diabetes Among Adults in the United States, 1988–2012.
[4] Centers for Disease Control and Prevention. National diabetes statistics report: estimates of diabetes and its burden in the United States, 2014. Atlanta, GA: US Department of Health and Human Services, 2014.

Slide 33

Slide 33 text

Train/Test Splits
Start date: 01/01/2006
End date: 06/15/2018
[Timeline figure: five splits, each showing a Train span and a Test span with their Label Spans]