Deep Learning COVID-19 on CXR using Limited Training Data Sets

Imaging AI based Management of COVID-19 Webinar Series, May 14th-15th, 2020

Jong Chul Ye

May 14, 2020

Transcript

  1. Deep Learning COVID-19 on CXR using Limited Training Data Sets

    (Oh et al., IEEE TMI, 2020)
    Jong Chul Ye, Ph.D., Professor, FIEEE
    Joint work with Yujin Oh & Sangjoon Park
    BISPL - BioImaging, Signal Processing, and Learning lab.
    Dept. Bio & Brain Engineering, Dept. Mathematical Sciences
    KAIST, Korea
  2. COVID-19 Pandemic

    • 4 million confirmed cases
    • 282,000 deaths
    (May 11th, 2020)
  3. Early diagnosis of COVID-19

    | Diagnosis       | RT-PCR    | Chest CT | Chest X-ray |
    |-----------------|-----------|----------|-------------|
    | Time            | > 6 hours | > 30 min | < 5 min     |
    | Sensitivity (%) | >90       | >90      | 69          |
    | Cost            | $         | $$$      | $           |

    (Fang et al.; Wong et al., Radiology, 2020)
    RT-PCR: reverse transcription polymerase chain reaction (Allplex™ 2019-nCoV Assay, Seegene Inc.)
    CXR abnormalities were detectable in 9% of patients whose initial RT-PCR was negative.
    References: Fang et al., Radiology, 2020; Wong et al., Radiology, 2020
  4. Deep Learning COVID-19 on CXR

    | Chest X-ray (CXR)                    | Classification                  | Sensitivity (%) |
    |--------------------------------------|---------------------------------|-----------------|
    | [COVID-Net] Wang et al., arXiv, 2020 | Normal / Non-COVID19 / COVID-19 | 91              |
    | Hemdan et al., arXiv, 2020           | Normal / COVID-19               | 100             |
    | Narin et al., arXiv, 2020            | Normal / COVID-19               | 96              |

    [COVID-Net] Wang et al., arXiv, 2020: not a good category, because bacterial and viral pneumonia are merged into Non-COVID19.
    However, viral pneumonia (e.g. SARS-CoV or MERS-CoV) is similar to COVID-19 even for experienced radiologists (Yoon et al., KJR, 2020).
  5. Deep Learning CXR as Patient Triage

    Brown et al., The Lancet, 1998

    |   | Cause                                | Frequency (%) |
    |---|--------------------------------------|---------------|
    | 1 | Streptococcus pneumoniae (Bacterial) | 15-42         |
    | 2 | Haemophilus influenzae (Bacterial)   | 11-12         |
    | 3 | Viral pneumonia                      | 8-13          |
    | 4 | Tuberculosis                         | < 10          |

    Potential Triage with CXR-AI
    • Exclude Normal, Tuberculosis, and Bacterial pneumonia at the early stage
    • RT-PCR or Chest CT for only Viral pneumonia → To save medical resources

    Common causes of Pneumonia
    • Bacterial pneumonia is the most common
    • Tuberculosis (TB) (depending on the geographical region)
    • Viral pneumonia is relatively less common (SARS-CoV, MERS-CoV, COVID-19)
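The triage rule on this slide can be written down as a small decision function. The following is only a sketch: the label strings and the function name `triage` are hypothetical, and the returned messages simply paraphrase the slide's two bullets.

```python
# Hypothetical label names; the slide lists the categories but not identifiers.
EXCLUDED_EARLY = {"normal", "tuberculosis", "bacterial"}


def triage(predicted_label: str) -> str:
    """Map a CXR-AI prediction to the next step suggested on the slide."""
    label = predicted_label.strip().lower()
    if label in EXCLUDED_EARLY:
        # Slide: exclude Normal, Tuberculosis, and Bacterial pneumonia early.
        return "excluded at the early stage"
    if label == "viral":
        # Slide: RT-PCR or chest CT only for viral pneumonia.
        return "confirm with RT-PCR or chest CT"
    raise ValueError(f"unknown label: {predicted_label!r}")
```

Only cases predicted as viral pneumonia proceed to the slower or costlier tests, which is where the resource saving on the slide comes from.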
  6. Technical Issues for CXR-AI

  7. Limited Training Data Sets

    Narin et al., arXiv, 2020; Hemdan et al., arXiv, 2020
    • Unbalanced age distribution due to limited public pneumonia dataset
      Guangzhou Women and Children’s Medical Center: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia
    • No well-curated COVID-19 dataset
      Cohen et al., arXiv, 2020: online publications, website
  8. Limited Training Data Sets

    [22] Chest X-Ray-Dataset. https://www.kaggle.com/praveengovi/coronahack-chest-xraydataset
    [23] COVID-19 image data collection. https://github.com/ieee8023/covid-chestxray-dataset
    Exclude potential bias for training
    • Exclude pediatric CXR
    • Universal preprocessing for data heterogeneity normalization
    Small number of training data sets
  9. Imaging Biomarkers

  10. Potential COVID-19 Biomarkers

    • Lung morphology
    • Mean lung intensity
    • Standard deviation of lung intensity
    • Cardiothoracic Ratio (CTR)
    Common CXR findings on COVID-19
    • Multifocal patchy consolidation & Ground-glass opacity (GGO) (Fang et al.; Ai et al.; Wong et al., Radiology, 2020)
    • Hypothesis: the CXR appearance of COVID-19 influences intensity-related biomarkers
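As a rough illustration of the intensity-related biomarkers and the CTR listed here, the sketch below computes them from binary masks with NumPy. It assumes that lung and heart masks are already available (for example from a segmentation step), and it approximates the CTR as the ratio of the widest horizontal extent of the heart mask to that of the lung mask. The function names are hypothetical and this is not taken from the talk.

```python
import numpy as np


def lung_intensity_stats(image, lung_mask):
    """Mean and standard deviation of pixel intensity inside the lung mask."""
    lung_pixels = np.asarray(image, dtype=np.float64)[np.asarray(lung_mask, dtype=bool)]
    if lung_pixels.size == 0:
        raise ValueError("lung mask is empty")
    return float(lung_pixels.mean()), float(lung_pixels.std())


def horizontal_extent(mask):
    """Width in pixels between the leftmost and rightmost mask columns."""
    columns = np.flatnonzero(np.asarray(mask, dtype=bool).any(axis=0))
    if columns.size == 0:
        raise ValueError("mask is empty")
    return int(columns[-1] - columns[0] + 1)


def cardiothoracic_ratio(heart_mask, lung_mask):
    """Approximate CTR: heart width divided by thoracic (lung) width."""
    return horizontal_extent(heart_mask) / horizontal_extent(lung_mask)
```

The thoracic width is taken from the lung mask alone, which is a simplification; a clinical CTR measurement uses the inner rib cage.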
  11. Potential COVID-19 Biomarkers

    • Inter-patch distribution
      • Mean intensity of each patch
      • Multi-focally distributed consolidation
    • Intra-patch distribution
      • STD of each patch intensity histogram
      • Local texture information
    More informative discriminating feature for COVID-19
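To make the two patch-level statistics concrete, here is a minimal NumPy sketch that returns the mean (inter-patch view) and the standard deviation (intra-patch view) of every patch. Tiling the image into a non-overlapping grid is an assumption made for simplicity; the slide does not say how patches are sampled, and `patch_statistics` is a hypothetical name.

```python
import numpy as np


def patch_statistics(image, patch_size):
    """Return per-patch mean and standard deviation over a non-overlapping grid."""
    image = np.asarray(image, dtype=np.float64)
    height, width = image.shape
    means, stds = [], []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = image[top:top + patch_size, left:left + patch_size]
            means.append(patch.mean())  # inter-patch: compare means across patches
            stds.append(patch.std())    # intra-patch: spread inside one patch
    return np.array(means), np.array(stds)
```

The spread of the returned means describes how intensity varies from patch to patch, while each standard deviation summarizes local texture within a patch.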
  12. Our Approach: Patch-based Classification

  13. Pre-processing

    • Mitigate data heterogeneity, bias, and overfitting during network training
    • Normalization of data heterogeneities
  14. Data Normalization

    • Normalize data-driven heterogeneities and bias
    [Figure: intensity histograms of segmented heart & lung intensity, Normal vs. COVID-19]
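The slide does not name the normalization steps. As one plausible illustration, the sketch below applies histogram equalization followed by gamma correction, which is a common way to bring images from different sources onto a comparable intensity scale. The choice of steps, the default `gamma=0.5`, and all function names are assumptions, not details from the talk.

```python
import numpy as np


def equalize_histogram(image, n_bins=256):
    """Map intensities through the image's empirical CDF, giving values in [0, 1]."""
    flat = np.asarray(image, dtype=np.float64).ravel()
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    cdf = np.cumsum(hist) / flat.size
    bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    # Interpolate each pixel's intensity onto the CDF sampled at the bin centers.
    return np.interp(flat, bin_centers, cdf).reshape(np.shape(image))


def gamma_correct(image01, gamma=0.5):
    """Apply a power-law curve to an image already scaled to [0, 1]."""
    return np.clip(image01, 0.0, 1.0) ** gamma


def normalize_cxr(image, gamma=0.5):
    """Histogram equalization followed by gamma correction (assumed pipeline)."""
    return gamma_correct(equalize_histogram(image), gamma)
```

Both steps are monotonic, so the ordering of pixel intensities is preserved while the overall histogram is flattened.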
  15. Segmentation

    • Normalize data-driven heterogeneities and bias
    [Figure: intensity histograms of segmented heart & lung intensity, Normal vs. COVID-19; 87%]
  16. Classification by Majority Voting
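Majority voting over patch-level predictions could look like the following minimal sketch. It assumes each patch has already been classified into one label; the function name and the tie-breaking behaviour (the label seen first among the tied ones wins) are choices made here, not details from the talk.

```python
from collections import Counter


def majority_vote(patch_predictions):
    """Return the label predicted for the largest number of patches."""
    if not patch_predictions:
        raise ValueError("need at least one patch prediction")
    counts = Counter(patch_predictions)
    # most_common keeps first-seen order among labels with equal counts.
    return counts.most_common(1)[0][0]
```

For example, `majority_vote(["covid", "normal", "covid"])` selects `"covid"` as the image-level decision.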

  17. Normal/Bacterial/TB/COVID Classifier Performance

    • Global approach
    • Patch-based approach

  18. Robustness to Small Dataset Size

    • Segmentation on normalized CXR
    • Patch-based classification
  19. Patch-based Training Stability

    • Global approach
    • Patch-based approach
    • No sign of overfitting even with small training data set
    [Figure: training curves, with regions annotated "overfitting"]
  20. Comparison with SOTA

    • COVID-Net: Wang et al., arXiv:2003.09871, 2020.
    • Limitation: merge bacterial/viral pneumonia into 1 class

    |                      | Ours  | COVID-Net |
    |----------------------|-------|-----------|
    | Weight               | 11.6M | 116.6M    |
    | COVID-19 Sensitivity | 100%  | 80%       |
  21. Interpretability

  22. Saliency Map for Interpretability

    • Grad-CAM (Selvaraju et al., Proc. of IEEE ICCV, 2017)
      ✓ Powerful method for visualizing CNN.
      ✓ Not suitable for multiple object visualization.
    • Probabilistic Grad-CAM (proposed)
      ✓ Integrating probability weights for each patch
      ✓ Suitable for patch processing
      ✓ Able to visualize multiple lesions.
    [Figure: COVID-19 case, Grad-CAM vs. Prob. Grad-CAM]
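One way to read "integrating probability weights for each patch" is sketched below: each patch's saliency map is scaled by that patch's class probability, pasted back at the patch location, and averaged where patches overlap. This is an illustrative interpretation, not the exact formula from the talk; the function name, the argument layout, and the averaging rule are assumptions. The per-patch maps are assumed to be already computed and resized to the patch size.

```python
import numpy as np


def probabilistic_cam(image_shape, patch_cams, patch_probs, patch_corners):
    """Blend per-patch saliency maps into one image-sized map.

    patch_cams:    list of 2-D arrays, one saliency map per patch
    patch_probs:   class probability predicted for each patch
    patch_corners: (top, left) position of each patch in the image
    """
    accumulated = np.zeros(image_shape, dtype=np.float64)
    coverage = np.zeros(image_shape, dtype=np.float64)
    for cam, prob, (top, left) in zip(patch_cams, patch_probs, patch_corners):
        height, width = cam.shape
        accumulated[top:top + height, left:left + width] += prob * cam
        coverage[top:top + height, left:left + width] += 1.0
    # Average over overlapping patches; pixels never covered stay at zero.
    return np.divide(accumulated, coverage,
                     out=np.zeros_like(accumulated), where=coverage > 0)
```

Because every patch contributes in proportion to its own probability, several separate lesions can each show up in the combined map.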
  23. Probabilistic Grad-CAM

    [Figure panels: Normal, Bacterial, Tuberculosis, COVID-19]

  24. Summary

    • Patient triage
    • Robust classification & interpretability under limited data set
  25. Acknowledgement • Grant: NRF of Korea no. 2020R1A2B5B03001980