Sepsis

Stephen Thomas
April 19, 2019

Transcript

  1. Automating Sepsis Diagnosis
    Stephen Thomas

    BMED 6517 - Spring 2019

  2. 40 features
    8 vital signs
    26 lab results
    6 demographic values
    sepsis label
    40,336 patients
    2932 sepsis positive patients
    8 to 336 hourly measurements
    70% missing data

  3. Data Preparation
    30% Holdout for Testing: 12,237 patients; 28,099 patients remain, 2064 of them sepsis positive
    Impute Temporal Data by Linear Interpolation/Extrapolation
    Balance Training Data by Undersampling
    Augment Temporal Data with Rates of Change: 34 temporal variables, velocity Δ(X) and acceleration Δ²(X)
    Normalize and Zero-Fill
    Training Set: 199,686 observations × 108 features
    Test Set: 469,332 observations × 108 features
    19,669 sepsis positive labels

  4. 30% Holdout
    12,237 patients held out for testing
    28,099 patients remaining, 2064 of them sepsis positive
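
A patient-level holdout like the one on this slide could be sketched as follows. This is a minimal illustration, not the author's code: the function name `patient_holdout`, the fixed seed, and the use of integer patient IDs are all assumptions. The point it shows is that the split is made over patients, so every hourly row of a patient lands on the same side.

```python
import random

def patient_holdout(patient_ids, test_fraction=0.3, seed=0):
    """Split patient IDs into (train, test) lists.

    Hypothetical helper: splitting by patient rather than by hourly row
    keeps one patient's measurements from leaking across the split.
    """
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)      # seeded so the split is reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]

train_ids, test_ids = patient_holdout(range(10))
```

An exact 30% of 40,336 patients would be about 12,101, while the slide reports 12,237, so the original split was presumably drawn some other way (for example, an independent random draw per patient).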

  5. Impute Temporal Data
    Linear Interpolation & Extrapolation
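
One way to read "linear interpolation and extrapolation" for a single patient's hourly series is sketched below. The slides do not give the exact rules, so the edge handling here is an assumption: gaps between readings are filled along the straight line joining their neighbours, gaps before the first or after the last reading extend the nearest segment's slope, a series with one reading is held constant, and a series with no readings is left untouched. `impute_linear` is a hypothetical name.

```python
def impute_linear(values):
    """Fill None gaps in an hourly series by linear inter/extrapolation."""
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)                  # nothing to anchor on
    if len(known) == 1:
        return [known[0][1]] * len(values)   # single reading: hold constant
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        # Pick the segment of known points that brackets hour i;
        # before the first or after the last reading, reuse the edge segment.
        j = 0
        while j < len(known) - 2 and known[j + 1][0] < i:
            j += 1
        (x0, y0), (x1, y1) = known[j], known[j + 1]
        filled.append(y0 + (y1 - y0) * (i - x0) / (x1 - x0))
    return filled

heart_rate = [None, 80.0, None, None, 86.0, None]   # illustrative values only
print(impute_linear(heart_rate))   # [78.0, 80.0, 82.0, 84.0, 86.0, 88.0]
```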

  6. Balance Training Data
    Undersample sepsis negative patients: 26,035 → 2064
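
The balancing step can be sketched as random undersampling of the negative class down to the size of the positive class (26,035 → 2064 on the slide). The helper name `undersample_negatives`, the dictionary input, and the seed are assumptions made for illustration; the slides do not say how the retained negatives were chosen.

```python
import random

def undersample_negatives(label_by_patient, seed=0):
    """Keep every positive patient and an equal-sized random sample of negatives."""
    positives = [p for p, is_pos in label_by_patient.items() if is_pos]
    negatives = [p for p, is_pos in label_by_patient.items() if not is_pos]
    rng = random.Random(seed)
    kept = rng.sample(negatives, min(len(positives), len(negatives)))
    return sorted(positives + kept)

# Toy cohort: 2 positive and 8 negative patients (made-up IDs).
toy_labels = {f"p{i}": i < 2 for i in range(10)}
balanced = undersample_negatives(toy_labels)
```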

  7. Capture Temporal History
    34 temporal variables
    velocity Δ(X)
    acceleration Δ²(X)
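
The velocity and acceleration features can be illustrated with backward differences over the hourly series. The slides only name Δ(X) and Δ²(X), so the exact difference scheme and the zero padding at the first hour are assumptions, as are the function names. Tripling the 34 temporal variables this way and adding the 6 demographic values gives 34 × 3 + 6 = 108 columns, which is consistent with the 108 features reported for the final data sets.

```python
def backward_diff(series):
    """Hourly change X[t] - X[t-1]; the first hour has no predecessor, so use 0."""
    if not series:
        return []
    return [0.0] + [b - a for a, b in zip(series, series[1:])]

def augment(series):
    """Return (value, velocity, acceleration) triples for one variable."""
    velocity = backward_diff(series)
    acceleration = backward_diff(velocity)   # difference of differences
    return list(zip(series, velocity, acceleration))

rows = augment([80.0, 82.0, 85.0, 85.0])     # illustrative values only
```

Because the first velocity is padded with zero, the acceleration at the second hour simply equals that hour's velocity; a stricter variant would mark the first two hours as undefined instead.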

  8. Normalize / Zero-Fill
    Training Set: 199,686 observations × 108 features
    Test Set: 469,332 observations × 108 features
    19,669 sepsis positive labels
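
A sketch of what "normalize and zero-fill" might look like for one feature column follows. The slides do not name the normalization, so z-scoring with statistics fitted on the training set is an assumption, as are the helper names. Under that assumption, filling the remaining gaps with zero after normalization amounts to substituting the training mean.

```python
from statistics import fmean, pstdev

def fit_scaler(train_column):
    """Mean and standard deviation of the observed (non-missing) training values."""
    observed = [v for v in train_column if v is not None]
    if not observed:
        return 0.0, 1.0                 # nothing observed: identity scaling
    std = pstdev(observed) or 1.0       # guard against a constant column
    return fmean(observed), std

def normalize_zero_fill(column, mean, std):
    """Z-score observed values; replace still-missing values with 0."""
    return [0.0 if v is None else (v - mean) / std for v in column]

mean, std = fit_scaler([1.0, None, 3.0])            # illustrative values only
scaled = normalize_zero_fill([1.0, None, 3.0], mean, std)
```

The same `mean` and `std` would then be reused on the test set, so that no test-set statistics enter the preprocessing.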

  9. Is Data Separable?

  10. Is Data Separable?

  11. LSTM Network Results
    Confusion matrix: True Class vs. Predicted Class (Sepsis Negative, Sepsis Positive); cell counts 297, 1172, 571, 10197
    ROC curves: True prediction rate vs. False prediction rate (both axes 0 to 1), one curve each for Sepsis Negative and Sepsis Positive
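
The transcript keeps the four confusion-matrix counts but not their positions in the matrix. Two consistency checks are still possible: the counts sum to the 12,237 held-out patients, and 297 + 571 equals 2932 − 2064 = 868, the number of sepsis positive patients left outside the training set. Which of those two cells is the true-positive count cannot be recovered from the text, so the rate calculation below uses placeholder counts; the function name `rates` is hypothetical.

```python
counts = [297, 1172, 571, 10197]        # as they appear in the transcript
assert sum(counts) == 12237             # matches the holdout size
assert 297 + 571 == 2932 - 2064         # positives outside the training set

def rates(tp, fn, fp, tn):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Placeholder counts for illustration only, not the slide's results.
sens, spec = rates(tp=8, fn=2, fp=5, tn=85)
```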

  12. Results
    • All models very sensitive to overfitting
    • Problem is worthy of computational challenge

    Model                                              Utility  Notes
    Support Vector Machine                             0.00     Gaussian kernel; other hyper-parameters had minimal effect
    Random Forest                                      0.02     Large leaf size and minimal splits to minimize overfitting
    Long Short-Term Memory Recurrent Neural Network    0.35     Limited epochs and relatively high learning rate to minimize overfitting
