
Noise Pollution in Hospital Readmission Prediction: Long Document Classification with Reinforcement Learning

Emory NLP

July 08, 2021

Transcript

  1. Task
     30-day hospital readmission prediction after kidney transplant, using clinical notes. The task is framed as document classification without domain-specific knowledge.
  2. Dataset
     Emory Kidney Transplant Dataset (EKTD)
     • 2,060 patients with a 3:7 positive-to-negative label ratio
     • 8 types of unstructured clinical notes
     • Zero to many notes available per patient for each type
  3. Challenges
     • Small sample size: mostly fewer than 2,000 patients/labels per note type
     • Long documents: 10k+ tokens
     • Noisy documents: tabular text and task-irrelevant sentences
     These challenges call for effective representations and noise-awareness.
  4. Noisy Text
     Tabular data, e.g., lab results and prescriptions. A representative excerpt, kept verbatim:
     Lab Fishbone (BMP, CBC, CMP, Diff) and critical labs - Last 24 hours (Not an official lab report. Please see flowsheet (or printed official lab reports) for official lab results.) 07/20/2013 03:25 ~ 07/20/2013 03:25 146H(Na) 110(Cl) 16(BUN) ~ 10.6L(Hgb) -----|-----|-----<108(Glu) 5.3(WBC)>-------<178(Plt) 3.6(K) 29(CO2) 5.83H(Cr) ~ 34.6L(Hct) 07/20/2013 03:25 Ca 9.7 07/20/2013 03:25 ~~~~~ALP ~~~~~ALT ~~~~~AST ~~~~~~Bili ~~~~~~Prot ~~~~~~ALB -----|-----|-----|-----|-----|-----? ~~~~~54 ~~~~12 ~~~~13L ~~~~0.7 ~~~~6.4 ~~~~3.6 (c) = Corrected C = Critical H = High L = Low NA = Not applicable A = Abnormal (ftn) = footnote
  5. Statistics

     Note Type                   # Patients   Avg. # Tokens   Description
     Consultations (CO)          1354         4395.3          Report for every outpatient consultation before transplantation
     Discharge Summary (DS)      514          1296.7          Summary at discharge from every hospital admission that occurred before transplant
     Echocardiography (EC)       1110         1073.6          Results of echocardiography
     History and Physical (HP)   1422         3025.1          Summary of the patient's medical history and clinical examination
     Operative (OP)              1472         4224.8          Report of surgical procedures
     Progress (PG)               1415         13723.4         Daily note during hospitalization summarizing the patient's medical status
     Selection Conference (SC)   2033         1189.2          Report from the selection committee's evaluation of each transplant candidate
     Social Worker (SW)          1118         1407.6          Report from encounters with social workers
  6. Approach
     Encoders:
     1. Bag-of-Words (BoW)
     2. Averaged word embedding
     3. Deep-learning encoders
        a. Transformer-based: ClinicalBERT (Huang et al., 2019)
        b. RNN-based: Bi-LSTM
     All encoders share the same training objective: minimize the negative log-likelihood of the gold labels (rendered below). BoW and averaged embeddings serve as baselines; the deep-learning encoders are prone to overfitting.
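A minimal rendering of that shared objective, with notation assumed here rather than taken from the slides (N patients, notes x_i, gold label y_i):

```latex
% Negative log-likelihood over N patients; p_theta is the probability the
% encoder-plus-classifier assigns to the gold readmission label
\mathcal{L}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta(y_i \mid x_i)
```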
  7. Encoders
     3. ClinicalBERT (Huang et al., 2019)
     • For each patient, split the notes into independent segments
     • Each segment inherits the patient's label
     • Ensemble the predictions from the segments
     Caveat: propagating the patient label to every segment introduces more noise. A sketch of this setup follows.
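A hypothetical sketch of the segment-level setup, in Python; the segment length and the mean-ensemble rule are illustrative assumptions, not details from the slides:

```python
from typing import List

def split_into_segments(tokens: List[str], max_len: int = 512) -> List[List[str]]:
    """Split a patient's concatenated notes into fixed-length segments
    (512 assumed here to match BERT's input limit)."""
    return [tokens[i:i + max_len] for i in range(0, len(tokens), max_len)]

def patient_probability(segment_probs: List[float]) -> float:
    """Ensemble segment-level readmission probabilities into one
    patient-level score (simple mean; the actual rule may differ)."""
    return sum(segment_probs) / len(segment_probs)
```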
  8. Encoders
     4. Bi-LSTM
     • For each patient, split the notes into short segments
     • Represent each segment by its averaged word embedding
     • Run a Bi-LSTM over the segment representations of each patient
     • Regularize with weight drop (Merity et al., 2018)
     A sketch of this architecture follows.
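A minimal sketch of the segment-level Bi-LSTM, assuming PyTorch; the dimensions and the classification head are illustrative, and the weight-drop regularizer (Merity et al., 2018) is omitted for brevity:

```python
import torch
import torch.nn as nn

class SegmentBiLSTM(nn.Module):
    def __init__(self, emb_dim: int = 300, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.clf = nn.Linear(2 * hidden, 2)  # binary readmission label

    def forward(self, segment_embs: torch.Tensor) -> torch.Tensor:
        # segment_embs: (batch, n_segments, emb_dim), each row the averaged
        # word embedding of one segment, in time order
        _, (h, _) = self.lstm(segment_embs)
        final = torch.cat([h[0], h[1]], dim=-1)  # forward + backward final states
        return self.clf(final)                   # logits over {readmit, not}
```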
  9. Reinforcement Learning
     Objective: automatic noise pruning
     Assumption: reducing the feature space alleviates overfitting
     Method: model pruning as a sequential decision problem at the segment level, which aligns with the fact that clinical documents arrive in time order
  10. Reinforcement Learning
      Components: the best-performing encoder + policy gradient
      Episode: the sequence of a patient's segments
      State: the previously selected segments + the current segment
      Action: {keep, prune}
      Reward: log-likelihood of the gold label computed on the final selected segments
      A rollout sketch follows.
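An illustrative REINFORCE-style rollout in Python; the linear policy, feature sizes, and the log_lik_fn stub are hypothetical, and only the episode structure (time-ordered segments, terminal log-likelihood reward) follows the slides:

```python
import torch
import torch.nn as nn

EMB_DIM = 300
policy = nn.Sequential(nn.Linear(2 * EMB_DIM, 1), nn.Sigmoid())

def rollout(segment_embs: torch.Tensor, log_lik_fn) -> torch.Tensor:
    """segment_embs: (n_segments, EMB_DIM); log_lik_fn maps the kept
    segments to the gold label's log-likelihood under the frozen encoder."""
    kept, log_probs = [], []
    context = torch.zeros(EMB_DIM)                   # summary of kept segments
    for seg in segment_embs:                         # notes arrive in time order
        state = torch.cat([context, seg])            # state = history + current
        p_keep = policy(state).squeeze()
        action = torch.bernoulli(p_keep.detach())    # sample keep (1) / prune (0)
        log_probs.append(torch.log(p_keep if action == 1 else 1 - p_keep))
        if action == 1:
            kept.append(seg)
            context = torch.stack(kept).mean(dim=0)
    reward = log_lik_fn(kept)                        # terminal reward
    return -(reward * torch.stack(log_probs).sum())  # policy-gradient loss
```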
  11. Reinforcement Learning
      Encourage pruning: provide an additional reward proportional to the pruning ratio
      Add entropy regularization (Mnih et al., 2016), as rendered below.
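The slide's formula does not survive the transcript; a standard form of the entropy bonus from Mnih et al. (2016), added to the objective with an assumed weight, is:

```latex
% Entropy of the keep/prune policy at state s_t; adding it to the objective
% with a small weight discourages prematurely deterministic policies
H\bigl(\pi_\theta(\cdot \mid s_t)\bigr)
  = -\sum_{a \in \{\mathrm{keep},\,\mathrm{prune}\}}
      \pi_\theta(a \mid s_t)\,\log \pi_\theta(a \mid s_t)
```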
  12. Experiments
      Evaluation metric: Area Under the Curve (AUC). The data are randomly split into 5 folds and the average is reported (a sketch of the protocol follows the table).

      Encoder        CO    DS    EC    HP    OP    PG    SC    SW
      BoW            58.6  62.1  52.0  58.9  51.8  61.2  59.3  51.6
      + cutoff at 2  58.6  62.3  52.8  59.0  51.9  61.3  59.3  51.9
      + stemming     58.9  61.8  53.4  59.4  51.9  61.5  59.3  51.6
      Avg. emb.      56.3  53.7  52.4  54.0  53.4  54.7  54.2  46.6
      ClinicalBERT   51.9  53.3  -     52.7  -     -     52.3  -
      LSTM           53.7  55.8  -     54.2  -     -     54.5  -
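A sketch of that evaluation protocol, assuming scikit-learn; fit_predict is a hypothetical callable standing in for any of the encoders above:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import roc_auc_score

def five_fold_auc(X, y, fit_predict) -> float:
    """Randomly split into 5 folds and report the average AUC.
    fit_predict(X_train, y_train, X_test) -> predicted probabilities."""
    aucs = []
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True).split(X):
        probs = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        aucs.append(roc_auc_score(y[test_idx], probs))
    return float(np.mean(aucs))
```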
  13. Experiments
      Verifying the noise observation: feature-space reduction on BoW (vocabulary sizes; reduction relative to vanilla in parentheses). A sketch of the two reduction steps follows.

      Type  Vanilla  + Cutoff       + Stemming
      CO    28213    15022 (46.8%)  12243 (56.6%)
      DS    11029    6117 (44.5%)   5228 (52.6%)
      HP    20245    11276 (44.3%)  9329 (53.9%)
      SC    19050    9873 (48.2%)   8200 (57.0%)
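A sketch of those two vocabulary-reduction steps, assuming scikit-learn and NLTK; the Porter stemmer, whitespace tokenization, and reading "cutoff at 2" as a document-frequency threshold are all assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

def stem_tokens(text: str):
    # Whitespace tokenization + Porter stemming (both assumed)
    return [stemmer.stem(tok) for tok in text.split()]

vanilla = CountVectorizer()         # full vocabulary
cutoff = CountVectorizer(min_df=2)  # one reading of "cutoff at 2": drop terms in < 2 documents
stemmed = CountVectorizer(min_df=2, tokenizer=stem_tokens)  # cutoff + stemming
```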
  14. Experiments
      Performance of reinforcement learning (AUC) against the best encoder from above:

                     CO    DS    HP    SC
      Best encoder   58.9  62.3  59.4  59.3
      RL             59.8  62.4  60.6  60.2
      Pruning ratio  26%   5%    19%   23%
  15. Experiments
      RL tuning essentials:
      1. Reward discount rate: keeps the scale of the policy gradient stable
      2. Entropy regularization: avoids local optima
      The discounted return is rendered below for reference.
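For reference, the discounted return in standard notation (assumed here, not taken from the slides); a discount rate γ < 1 bounds how strongly late rewards scale each early step's gradient:

```latex
% Return at step t with discount rate gamma and per-step reward r_k
G_t = \sum_{k=t}^{T} \gamma^{\,k-t}\, r_k
```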
  16. Summary
      • The old bag-of-words model is still a strong encoder for this dataset of long texts and small sample size.
      • Deep learning overfits strongly on this dataset.
      • RL further improves performance while performing automatic noise pruning.
      • RL identifies two types of noise: typical noisy tokens and task-specific noisy text.
  17. References
      Kexin Huang, Jaan Altosaar, and Rajesh Ranganath. 2019. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission.
      Stephen Merity, Nitish Shirish Keskar, and Richard Socher. 2018. Regularizing and Optimizing LSTM Language Models.
      Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. 2016. Asynchronous Methods for Deep Reinforcement Learning.