Sepsis

Stephen Thomas

April 19, 2019

Transcript

  1. Automating Sepsis Diagnosis. Stephen Thomas, BMED 6517, Spring 2019.

  2. The Dataset: 40 features (8 vital signs, 26 lab results, 6 demographic values); 40,336 patients, each with a sepsis label; 70% of values missing; 2,932 sepsis-positive patients; 8 to 336 hourly measurements per patient.
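
This is the layout of the PhysioNet/Computing in Cardiology Challenge 2019 sepsis data, where each patient arrives as a pipe-separated .psv file with one row per hour; a minimal loading sketch under that assumption (the file pattern and patient-ID scheme here are illustrative):

    import glob
    import pandas as pd

    # Load every patient file into one long table: one row per hour,
    # 40 feature columns plus the SepsisLabel column, tagged with a
    # patient identifier so later steps can group by patient.
    def load_patients(pattern="training/*.psv"):
        frames = []
        for pid, path in enumerate(sorted(glob.glob(pattern))):
            df = pd.read_csv(path, sep="|")
            df["patient"] = pid
            frames.append(df)
        return pd.concat(frames, ignore_index=True)
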
  3. Data Preparation: Starting from the full dataset (40 features, 40,336 patients, 70% missing data, 2,932 sepsis-positive patients, 8 to 336 hourly measurements), the pipeline is: hold out 30% of patients for testing (12,237 test patients, 28,099 training patients, 2,064 of them sepsis-positive); impute temporal data by linear interpolation/extrapolation; balance the training data by undersampling; augment the 34 temporal variables with rates of change (velocity Δ(X) and acceleration Δ²(X)); normalize and zero-fill. The result is a training set of 199,686 observations × 108 features and a test set of 469,332 observations × 108 features with 19,669 sepsis-positive labels. Each step is sketched in code after its slide below.
  4. 30% Holdout: The 40,336 patients are split into 28,099 training patients (2,064 of them sepsis-positive) and 12,237 test patients held out for evaluation.
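
A minimal sketch of the holdout, assuming the long-format table from the loader above; the split is done over patient IDs so no patient contributes rows to both sets:

    from sklearn.model_selection import train_test_split

    # Split at the patient level, not the row level: all of a patient's
    # hourly observations land on the same side of the split.
    def holdout_split(data, test_size=0.30, seed=0):
        ids = data["patient"].unique()
        train_ids, test_ids = train_test_split(ids, test_size=test_size,
                                               random_state=seed)
        return (data[data["patient"].isin(train_ids)],
                data[data["patient"].isin(test_ids)])
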
  5. Impute Temporal Data: For the 28,099 training and 12,237 test patients, missing temporal values are filled by linear interpolation, with linear extrapolation beyond the first and last observed values of each patient's record.
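
A sketch of per-patient, per-variable linear interpolation and extrapolation (scipy's interp1d stands in for whatever the author actually used; columns with fewer than two observed values are left for the later zero-fill):

    import numpy as np
    from scipy.interpolate import interp1d

    # Fill one hourly series: interpolate between observed values and
    # extrapolate linearly beyond the first/last observation.
    def impute_series(values):
        values = np.asarray(values, dtype=float)
        t = np.arange(len(values))
        valid = ~np.isnan(values)
        if valid.sum() == 0:
            return values                      # stays NaN; zero-filled later
        if valid.sum() == 1:
            return np.full_like(values, values[valid][0])
        f = interp1d(t[valid], values[valid], kind="linear",
                     bounds_error=False, fill_value="extrapolate")
        return f(t)

    def impute_patient(df, temporal_cols):
        out = df.copy()
        for col in temporal_cols:
            out[col] = impute_series(out[col].to_numpy())
        return out

    # train = train.groupby("patient", group_keys=False).apply(impute_patient, temporal_cols)
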
  6. Balance Training Data: Undersample sepsis-negative training patients from 26,035 to 2,064, matching the number of sepsis-positive patients.
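
A sketch of the undersampling step, assuming a patient counts as sepsis-positive if its SepsisLabel is ever 1 (the column name follows the loader above):

    import pandas as pd

    # Keep every sepsis-positive patient and a random sample of
    # sepsis-negative patients of the same size (26,035 -> 2,064).
    def undersample_negatives(train, seed=0):
        per_patient = train.groupby("patient")["SepsisLabel"].max()
        pos_ids = per_patient.index[per_patient == 1]
        neg_ids = per_patient.index[per_patient == 0]
        keep_neg = pd.Series(neg_ids).sample(n=len(pos_ids), random_state=seed)
        keep = set(pos_ids).union(keep_neg)
        return train[train["patient"].isin(keep)]
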

  7. Capture Temporal History: Augment each of the 34 temporal variables with its velocity Δ(X) and acceleration Δ²(X).
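
A sketch of this augmentation (the _vel/_acc column names are illustrative); with 34 temporal variables it yields 34 × 3 = 102 temporal columns, which together with the 6 demographic values accounts for the 108 features reported on the next slide:

    # First and second differences within each patient's timeline; the
    # NaNs at each patient's first hours are handled by the zero-fill step.
    def add_derivatives(data, temporal_cols):
        out = data.copy()
        vel = out.groupby("patient")[temporal_cols].diff()   # velocity Δ(X)
        acc = vel.groupby(out["patient"]).diff()             # acceleration Δ²(X)
        out[[c + "_vel" for c in temporal_cols]] = vel.to_numpy()
        out[[c + "_acc" for c in temporal_cols]] = acc.to_numpy()
        return out
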
  8. Normalize / Zero-Fill: Final training set: 199,686 observations × 108 features. Final test set: 469,332 observations × 108 features, of which 19,669 observations carry a sepsis-positive label.
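
A sketch of the final step, assuming z-score normalization with training-set statistics; after normalization zero is the training mean, so zero-filling the remaining gaps is a neutral default:

    # Normalize with training-set statistics only, then zero-fill the NaNs
    # left by differencing and by never-observed variables.
    def normalize_zero_fill(train, test, feature_cols):
        mean = train[feature_cols].mean()
        std = train[feature_cols].std().replace(0, 1.0)   # guard constant columns
        train, test = train.copy(), test.copy()
        train[feature_cols] = ((train[feature_cols] - mean) / std).fillna(0.0)
        test[feature_cols] = ((test[feature_cols] - mean) / std).fillna(0.0)
        return train, test
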
  9. Is Data Separable?

  10. Is Data Separable?
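
The transcript carries no content for these two slides; they presumably show low-dimensional views of the prepared observations. One common way to eyeball separability is a 2-D PCA projection colored by label (PCA is an assumption here, not necessarily what the slides used):

    import matplotlib.pyplot as plt
    from sklearn.decomposition import PCA

    # Project the 108-feature observations to 2-D and color by sepsis label.
    # X: (n_observations, 108) NumPy array; y: 0/1 NumPy array per observation.
    def plot_separability(X, y):
        p = PCA(n_components=2).fit_transform(X)
        plt.scatter(p[y == 0, 0], p[y == 0, 1], s=2, alpha=0.3, label="sepsis negative")
        plt.scatter(p[y == 1, 0], p[y == 1, 1], s=2, alpha=0.3, label="sepsis positive")
        plt.legend()
        plt.title("Is the data separable?")
        plt.show()
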

  11. LSTM Network Results: Confusion matrix over the 12,237 test patients (predicted vs. true class): the 868 truly sepsis-positive patients split 297 / 571 across the two predicted classes, and the 11,369 truly sepsis-negative patients split 1,172 / 10,197. [Figure: per-class ROC curves, true prediction rate vs. false prediction rate, for Sepsis Negative and Sepsis Positive.]
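
A minimal sketch of an LSTM of the kind reported here, assuming per-hour sequence labeling on sequences zero-padded to the 336-hour maximum stay; the layer width is illustrative, while the limited epochs and relatively high learning rate follow the notes on the final slide:

    import tensorflow as tf

    # Per-hour sepsis probability from the 108 prepared features.
    # Zero-padded timesteps are masked out of the recurrence and the loss.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(336, 108)),
        tf.keras.layers.Masking(mask_value=0.0),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-2),
                  loss="binary_crossentropy")
    # model.fit(X_train, y_train, epochs=5, batch_size=64)  # few epochs, per the slides
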
  12. Results
      • All models were very sensitive to overfitting.
      • The problem is worthy of a computational challenge.

      Model                                             Utility   Notes
      Support Vector Machine                            0.00      Gaussian kernel; other hyper-parameters had minimal effect
      Random Forest                                     0.02      Large leaf size and minimal splits to minimize overfitting
      Long Short-Term Memory Recurrent Neural Network   0.35      Limited epochs and relatively high learning rate to minimize overfitting