
Machine learning with ventilator data to improve reporting on critically ill newborn infants

Our efforts to date, as an open collaboration between the NHS and ModelInsight, to use machine learning to label the start of each breath in baby ventilator time series data. Presented at the PyDataLondon 2017 conference.



May 08, 2017



  1. Machine learning with ventilator data to improve reporting on critically ill newborn infants

    Gusztav Belteki, Giles Weaver, Ian Ozsvald
    PyDataLondon 2017
  2. • Gusztav Belteki – Consultant Neonatologist in Cambridge, interested in neonatal ventilation and data analysis with Python

    • Ian Ozsvald – Long-time Pythonista, PyDataLondon co-founder, ML consultant, author
    • Giles Weaver – Bioinformatician turned Python data scientist, Review Committee member
    • This collaboration builds on our PyData London January talk
  3. Why do some newborn babies require mechanical ventilation?

    • Prematurity: lungs, muscles and brain are too immature to support adequate gas exchange
    • Full-term babies may require intensive care (e.g. infection, after an operation, birth depression etc.)
    • We have >1500 “ventilator days” yearly
    • Ventilation is also an important part of paediatric and adult intensive care
  4. [Images: ventilator in 1985; ventilator in 2015]

  5. Two respiratory pumps

    • During spontaneous breathing, ventilation is an ACTIVE process with the patient using negative pressures
    • Mechanical ventilation during general anaesthesia is a PASSIVE process with the ventilator using positive pressures
    • During neonatal intensive care, patients do not receive full sedation or relaxation: ventilation is the combination and superimposition of these two pumps
  6. [Figures: “Clean ventilator breath”; patient-ventilator interaction: inspiration during expiratory phase]

  7. [Figures: patient not breathing; patient breathing]

  8. Data Collection

    • Performed a service evaluation of ventilation on the neonatal intensive care unit
    • Downloaded ~160 days of ventilator data from 59 ventilated neonates
    • Most recordings are >24 hours, usually 2-4 days
    • Time series data, sampling rate is 100 Hz (one sample every 10 ms)
    • Data are retrieved as CSV files
    • Generates approximately 650 MB of data per 24 hours of ventilation (1 ventilator day)
  9. [Figure: axes Flow (L/min) and Pressure (mBar); panels: artificial lung, sedated patient, breathing patient; 3 hours of ventilation (~1,000,000 data points)]

    Mechanical ventilation is always a complex physical process due to interaction between the ventilator and the patient
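The data volumes quoted above can be cross-checked with a little arithmetic. The sketch below uses only the 100 Hz sampling rate and the roughly 650 MB per ventilator day from the slides; the helper name `samples_in` and the reading of 1 MB as 10^6 bytes are our own assumptions.

```python
SAMPLE_RATE_HZ = 100        # from the slides: one sample every 10 ms
SECONDS_PER_HOUR = 3600
BYTES_PER_DAY = 650e6       # ~650 MB per ventilator day, assuming 1 MB = 10**6 bytes


def samples_in(hours, rate_hz=SAMPLE_RATE_HZ):
    """Number of samples recorded in the given number of hours (hypothetical helper)."""
    return int(hours * SECONDS_PER_HOUR * rate_hz)


three_hours = samples_in(3)      # the "~1,000,000 data points" trace
one_day = samples_in(24)         # one ventilator day
bytes_per_row = BYTES_PER_DAY / one_day

print(f"3 hours: {three_hours:,} samples")
print(f"24 hours: {one_day:,} samples")
print(f"about {bytes_per_row:.0f} bytes per CSV row")
```

Three hours at 100 Hz is 1,080,000 samples, which agrees with the rounded figure on the slide, and a ventilator day works out at roughly 75 bytes per sample row under these assumptions.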
  10. Aims (clinical)

    • To provide the clinician with SIMPLE and QUANTITATIVE indicators of ventilator-patient interactions

    However…
    …this requires looking at individual breaths in isolation
    …which would require ventilator data to be split into individual breaths…
    …that is not feasible to do manually on a longer trace
  11. Aims (data science)

    • Automatically segment each breath (43,200 per night) – we have a working prototype
    • Summarisation of breathing statistics is only possible once we’ve segmented the breaths
    • Calculate “auto-PEEP” - a harmful condition for the baby
    • (future) Begin to classify patient-initiated or ventilator-initiated breaths, and explore other ideas, once we have segmentations in place
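Once start-of-breath positions are known, segmentation itself is a simple slicing step. A minimal sketch, assuming the trace is a plain sequence of samples and the starts are sample indices; the function name `split_into_breaths` is hypothetical and not taken from the project's code.

```python
def split_into_breaths(samples, start_indices):
    """Slice a trace into per-breath segments.

    Each segment runs from one start index up to (not including) the next.
    Samples before the first detected start are dropped.
    """
    starts = sorted(start_indices)
    bounds = starts + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds[:-1], bounds[1:])]


trace = list(range(10))                      # stand-in for a flow or pressure trace
breaths = split_into_breaths(trace, [2, 5, 8])
print(breaths)
```

The hard part of the project is producing reliable start indices; everything downstream (summary statistics, auto-PEEP) depends on them.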
  12. Summarising a set of breaths

    • If we can segment the breaths – what do we want to see?

    Ventilator delivers backup inflations if baby does not breathe for some time
    Breaths triggered by the baby are regularly spaced
    Different levels of ventilator contribution
  13. Summarising a set of breaths

    • Having segmented breaths – we can show ‘what happened in a period of time’ - descriptive statistics of ventilator/patient interactions
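As an illustration of the kind of descriptive statistics that segmentation makes possible, the sketch below summarises the spacing between breath starts. The choice of statistics and the name `summarise_breaths` are our assumptions; the slides do not list which indicators were computed.

```python
from statistics import mean, pstdev


def summarise_breaths(start_indices, rate_hz=100):
    """Summarise breath timing from start-of-breath sample indices (hypothetical helper)."""
    # Interval between consecutive breath starts, converted from samples to seconds.
    intervals = [(b - a) / rate_hz for a, b in zip(start_indices, start_indices[1:])]
    return {
        "n_breaths": len(start_indices),
        "mean_interval_s": mean(intervals),
        "sd_interval_s": pstdev(intervals),   # low spread suggests regularly spaced breaths
        "breaths_per_min": 60 / mean(intervals),
    }


summary = summarise_breaths([0, 100, 200, 300])
print(summary)
```

A long gap followed by a ventilator backup inflation would show up here as an outlying interval, so a real report would probably add percentiles or a histogram rather than rely on the mean alone.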
  14. Project phases

    • Get CSVs, check they’re sane
    • Fix data issues (timestamps #sigh)
    • Exploratory Data Analysis
    • Hand-building a Gold Standard for ML
    • Simple many-moving-averages “classifier”
    • Use of Random Forests and building up features for improved ML
    • Review with Dr. Belteki
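The slides do not describe how the many-moving-averages “classifier” worked. One plausible reading is a crossover rule: flag a breath start when a short moving average of flow rises above a longer one. The sketch below implements that reading only; the window lengths and function names are illustrative assumptions, not the project's method.

```python
def moving_average(values, window):
    """Trailing moving average; early samples average over what is available so far."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]   # drop the sample that left the window
        out.append(total / min(i + 1, window))
    return out


def crossover_starts(flow, short=5, long=50):
    """Indices where the short average crosses above the long average."""
    fast = moving_average(flow, short)
    slow = moving_average(flow, long)
    starts = []
    for i in range(1, len(flow)):
        if fast[i - 1] <= slow[i - 1] and fast[i] > slow[i]:
            starts.append(i)
    return starts


# Idealised square-wave "flow": 0.5 s of expiration then 0.5 s of inspiration at 100 Hz.
flow = ([-1.0] * 50 + [1.0] * 50) * 3
print(crossover_starts(flow))
```

On clean data a rule like this finds each upswing, but it has no way to cope with the patient-ventilator interactions shown earlier, which is presumably why the project moved on to a learned model.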
  15. Our technical approach

    • Notebooks, git, gitter
    • Module-ised Notebooks (see GitHub), which enables tool reuse
    • Thanking:
  16. Annotating breaths by hand

    • Giles annotated 20 minutes of 4 traces using a custom Bokeh + ipywidgets tool
  17. Random Forests and classifications

    • Used a 75% train & 25% test split per patient on 5 minutes of data (100 Hz), with approximately 1 positive sample per 100 samples
    • Built up a set of features that solved the problem reliably for most patients – rates of change and short-history indicators
    • Developed a GUI diagnostic tool
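To make the setup concrete, here is a rough sketch of per-sample classification with a Random Forest on a synthetic trace. Everything specific is an assumption: the sine-wave data stands in for real flow, the lag choices in `build_features` are our guess at “rates of change and short-history indicators”, and the chronological 75/25 split is one possible reading of the slide. Only `RandomForestClassifier` and `predict_proba` come from scikit-learn itself.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

RATE_HZ = 100  # sampling rate from the slides


def make_synthetic_trace(n_breaths=300, seed=0):
    """Hypothetical stand-in for 5 minutes of flow: one breath per second plus noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_breaths * RATE_HZ)
    flow = np.sin(2 * np.pi * t / RATE_HZ) + 0.05 * rng.standard_normal(t.size)
    labels = np.zeros(t.size, dtype=int)
    labels[::RATE_HZ] = 1  # one positive (start-of-breath) sample per 100
    return flow, labels


def build_features(flow, lags=(1, 2, 5, 10)):
    """Rate-of-change and short-history features (illustrative choices)."""
    columns = [flow, np.gradient(flow)]  # central difference, uses one future sample
    for lag in lags:
        lagged = np.roll(flow, lag)      # lagged[i] == flow[i - lag]
        lagged[:lag] = flow[0]           # pad the start instead of wrapping around
        columns.append(flow - lagged)    # change over the last `lag` samples
    return np.column_stack(columns)


flow, labels = make_synthetic_trace()
X = build_features(flow)

# Assumption: a chronological split, first 75% for training, last 25% for testing.
split = int(0.75 * len(flow))
X_train, y_train = X[:split], labels[:split]
X_test, y_test = X[split:], labels[split:]

model = RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0)
model.fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]  # probability of "start of breath" per sample

print("mean proba at true starts:", proba[y_test == 1].mean())
print("mean proba elsewhere:", proba[y_test == 0].mean())
```

With roughly 1% positives, accuracy is not informative; comparing predicted probabilities at true starts against the rest, or using precision and recall with a small timing tolerance, gives a more honest picture.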
  18. Correctly identified breaths

    • True positives have a consistent shape
    • High predict_proba at the correct location

    Heavily sedated patient, all work done by the ventilator
  19. The interesting failures

    • False negatives (and positives) have some odd shapes; note the double predict_proba indications

    Conscious patient with strong breathing effort; the ventilator makes some contribution
    We see delayed triggering of the ventilator back-up breath in the blue example
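The slides do not say how per-sample probabilities were turned into breath labels. One simple option, sketched here as an assumption, is to keep the strongest probability peaks and suppress any other candidate within a minimum gap. The function name, `threshold` and `min_gap` are all illustrative.

```python
def pick_breath_starts(proba, threshold=0.5, min_gap=20):
    """Greedy peak picking: strongest candidates first, at least `min_gap` samples apart."""
    candidates = [i for i, p in enumerate(proba) if p >= threshold]
    candidates.sort(key=lambda i: proba[i], reverse=True)
    accepted = []
    for i in candidates:
        if all(abs(i - j) >= min_gap for j in accepted):
            accepted.append(i)
    return sorted(accepted)


# A double indication around samples 10 and 14, then a clean peak at 60.
proba = [0.0] * 100
proba[10], proba[14], proba[60] = 0.9, 0.7, 0.8
print(pick_breath_starts(proba))
```

A rule like this merges double indications into one event, which tidies the output but also hides exactly the cases this slide calls interesting, so the raw probabilities are worth keeping for review.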
  20. The GUI Tools

    • Bokeh in Notebook very useful for annotating timeseries sections
    • Matplotlib used for ML diagnosis (last 2 slides)
    • Used Notebook widgets (ipywidgets)
    • Had to ask for a new feature (Notebook team responsive – thanks!)
    • If you wrote these from scratch you’d probably want to put aside several days
  21. Next steps

    • The Random Forest model does pretty well and we can explain why it works
    • Trained models generalise over time; we haven’t tested how well they generalise over patients
    • Next we’ll validate whether we’re “good enough” to start work on auto-PEEP calculations
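Testing generalisation over patients usually means holding out all data from one patient at a time. The sketch below shows the splitting logic in plain Python; the function name and the patient labels are hypothetical. scikit-learn's `LeaveOneGroupOut` provides the same kind of split for use with its cross-validation tools.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, train_indices, test_indices) for each patient in turn."""
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test


# One patient label per sample (or per breath); values here are made up.
patient_ids = ["a", "a", "b", "c"]
for held_out, train, test in leave_one_patient_out(patient_ids):
    print(held_out, train, test)
```

Because no sample from the held-out patient appears in training, the resulting score reflects performance on an unseen baby rather than on a later stretch of a trace the model has already seen.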
  22. Your experience?

    • Have you tackled timeseries event classification in different ways?
    • Do you have experience with recurrent neural networks (or a similar recurrent approach) that might work?
    • We’re very open to feedback and ideas!
  23. Acknowledgments:

    • Professor Colin Morley for advice
    • Draeger Medical for providing the ventilator data download tool
    • All the doctors and nurses of Cambridge Neonatal Intensive Care Unit
    • ModelInsight and Endava for financial support
  24. Contact

    • IanOzsvald.com - @IanOzsvald
    • Gusztav – github.com/Ventilator-Python
    • Giles – linkedin.com/in/gilesweaver

    Previous talks:
    https://www.meetup.com/PyData-London-Meetup/events/236523797/
    http://2016.pyconuk.org/talks/python-in-medicine-ventilator-data/