Machine learning with ventilator data to improve reporting on critically ill newborn infants

Gusztav Belteki, Giles Weaver, Ian Ozsvald
PyDataLondon 2017

• Gusztav Belteki - Consultant Neonatologist in Cambridge, interested in neonatal ventilation and data analysis with Python
• Ian Ozsvald - long-time Pythonista, PyDataLondon co-founder, ML consultant, author
• Giles Weaver - bioinformatician turned Python data scientist, Review Committee member
• This collaboration builds on our PyData London January talk

Why do some newborn babies require mechanical ventilation?
• Prematurity: the lungs, muscles and brain are too immature to support adequate gas exchange
• Full-term babies may also require intensive care (e.g. infection, after an operation, birth depression)
• We have >1500 “ventilator days” yearly
• Ventilation is also an important part of paediatric and adult intensive care

Figure panels: ventilator in 1985; ventilator in 2015

Two respiratory pumps
• During spontaneous breathing, ventilation is an ACTIVE process: the patient generates negative pressures
• Mechanical ventilation during general anaesthesia is a PASSIVE process: the ventilator applies positive pressures
• During neonatal intensive care, patients do not receive full sedation or relaxation, so ventilation is the combination and superimposition of these two pumps

Figure panels: “clean ventilator breath”; patient-ventilator interaction with inspiration during the expiratory phase

Figure panels: patient not breathing; patient breathing

Data Collection
• Performed a service evaluation of ventilation on the neonatal intensive care unit
• Downloaded ~160 days of ventilator data from 59 ventilated neonates
• Most recordings are >24 hours, usually 2-4 days
• Time series data, sampled at 100 Hz (every 10 ms)
• Data are retrieved as CSV files
• Approximately 650 MB of data are generated per 24 hours of ventilation (1 ventilator day)
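The figures on this slide can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single 100 Hz sample clock and reads "650 Mbyte" as 650 × 10^6 bytes. The function and variable names are not from the project.

```python
SAMPLE_RATE_HZ = 100                # from the slide: one sample every 10 ms
SECONDS_PER_DAY = 24 * 60 * 60


def samples_in(seconds, rate_hz=SAMPLE_RATE_HZ):
    """Number of samples recorded over `seconds` at `rate_hz`."""
    return seconds * rate_hz


samples_per_day = samples_in(SECONDS_PER_DAY)     # one "ventilator day"
samples_3_hours = samples_in(3 * 60 * 60)         # a 3 hour trace

# Assumption: 650 Mbyte is taken as 650 * 10**6 bytes per ventilator day.
bytes_per_sample_row = 650e6 / samples_per_day

# Rough size of the whole download (~160 ventilator days), in GB.
total_gb = 160 * 650 / 1000

print(samples_per_day, samples_3_hours, round(bytes_per_sample_row), total_gb)
```

Under these assumptions a ventilator day is 8,640,000 samples, a 3 hour trace is 1,080,000 samples, each sample row costs roughly 75 bytes of CSV, and the full data set is on the order of 100 GB.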

Figure: flow (L/min) and pressure (mBar) traces for an artificial lung, a sedated patient and a breathing patient; 3 hours of ventilation (~1,000,000 data points)
Mechanical ventilation is always a complex physical process due to the interaction between the ventilator and the patient

Aims (clinical)
• To provide the clinician with SIMPLE and QUANTITATIVE indicators of ventilator-patient interactions
However…
…this requires looking at individual breaths in isolation
…which would require ventilator data to be split into individual breaths…
…that is not feasible to do manually on a longer trace

Aims (data science)
• Automatically segment each breath (43,200 per night) - we have a working prototype
• Summarising breathing statistics is only possible once the breaths have been segmented
• Calculate “auto-PEEP” - a harmful condition for the baby
• (future) Begin to classify breaths as patient-initiated or ventilator-initiated, and explore other ideas once we have segmentations in place
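To make "segmenting each breath" concrete, here is a minimal sketch on an idealised signal. It is not the prototype described in this talk: it assumes flow is positive during inspiration and simply marks each upward zero crossing as a breath onset, which would be far too naive for real traces with patient-ventilator interaction. All names are hypothetical. The 43,200 figure is consistent with, for example, 60 breaths per minute over 12 hours, although the slide does not say how it was derived.

```python
import math

SAMPLE_RATE_HZ = 100


def synthetic_flow(duration_s, breaths_per_min, rate_hz=SAMPLE_RATE_HZ):
    """Idealised flow trace: a sine wave, positive during inspiration."""
    n = int(duration_s * rate_hz)
    freq = breaths_per_min / 60.0
    # Sample at the middle of each 10 ms bin so no sample lands exactly on zero.
    return [math.sin(2 * math.pi * freq * (i + 0.5) / rate_hz) for i in range(n)]


def breath_onsets(flow):
    """Indices where flow goes from <= 0 to > 0 (assumed start of inspiration)."""
    return [i for i in range(1, len(flow)) if flow[i - 1] <= 0 < flow[i]]


def segment(flow, onsets):
    """Cut the trace into complete breaths, onset to next onset."""
    return [flow[a:b] for a, b in zip(onsets, onsets[1:])]


flow = synthetic_flow(duration_s=10, breaths_per_min=60)
onsets = breath_onsets(flow)
breaths = segment(flow, onsets)

# One reading of "43,200 per night": 60 breaths/min for 12 hours (assumption).
breaths_per_night = 60 * 60 * 12
print(len(onsets), len(breaths), breaths_per_night)
```

The first breath in the trace has no preceding sample and is therefore not detected, so a 10 second trace at 60 breaths per minute yields 9 onsets and 8 complete breaths.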

Summarising a set of breaths
• If we can segment the breaths, what do we want to see?
Figure annotations:
Ventilator delivers backup inflations if baby does not breathe for some time
Breaths triggered by the baby are regularly spaced
Different levels of ventilator contribution

Summarising a set of breaths
• Having segmented the breaths, we can show ‘what happened in a period of time’: descriptive statistics of ventilator/patient interactions
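As an illustration of the kind of descriptive statistics that become possible once breath onsets are known, the sketch below summarises a list of onset positions. The onset values are made up, and the choice of statistics is an assumption rather than the report shown in the talk.

```python
import statistics


def summarise(onsets, rate_hz=100):
    """Descriptive statistics for breath onsets given as sample indices."""
    intervals = [(b - a) / rate_hz for a, b in zip(onsets, onsets[1:])]
    mean_s = statistics.mean(intervals)
    return {
        "n_breaths": len(intervals),
        "mean_interval_s": mean_s,
        "sd_interval_s": statistics.stdev(intervals),
        "breaths_per_min": 60.0 / mean_s,
        "longest_pause_s": max(intervals),
    }


# Hypothetical onsets: four roughly regular breaths, then a 3 second pause.
example_onsets = [0, 100, 200, 310, 400, 700]
summary = summarise(example_onsets)
print(summary)
```

A long `longest_pause_s` relative to the mean interval is the sort of signal that could point to a period where the baby stopped triggering breaths.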

Project phases
• Get CSVs, check they’re sane
• Fix data issues (timestamps #sigh)
• Exploratory Data Analysis
• Hand-building a Gold Standard for ML
• Simple many-moving-averages “classifier”
• Use of Random Forests and building up features for improved ML
• Review with Dr. Belteki
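The slides do not describe how the moving-averages "classifier" worked. One possible reading, sketched below purely as an assumption, is to compare a fast and a slow trailing average and flag the samples where the fast one rises above the slow one. The window sizes and function names are hypothetical.

```python
def moving_average(xs, window):
    """Trailing mean over the last `window` samples (fewer at the start)."""
    out, total = [], 0.0
    for i, x in enumerate(xs):
        total += x
        if i >= window:
            total -= xs[i - window]     # drop the sample leaving the window
        out.append(total / min(i + 1, window))
    return out


def crossover_onsets(signal, fast=3, slow=20):
    """Indices where the fast average rises above the slow average."""
    fast_ma = moving_average(signal, fast)
    slow_ma = moving_average(signal, slow)
    diff = [f - s for f, s in zip(fast_ma, slow_ma)]
    return [i for i in range(1, len(diff)) if diff[i - 1] <= 0 < diff[i]]


# Toy signal: two square "inflations" separated by flat baseline.
signal = ([0.0] * 30 + [1.0] * 30) * 2
candidates = crossover_onsets(signal)
print(candidates)
```

A rule like this has no notion of breath shape, which may be part of why the project moved on to a trained model with richer features.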

Our technical approach
• Notebooks, git, gitter
• Module-ised Notebooks (see github)
  • Enables tool reuse
• Thanking:

Annotating breaths by hand
• Giles annotated 20 minutes of 4 traces using a custom Bokeh + ipywidgets tool

Random Forests and classifications
• Used a 75% train & 25% test split per patient on 5 minutes of data (100 Hz), with approximately 1 positive sample per 100 samples
• Built up a set of features that solved the problem reliably for most patients: rates of change and short-history indicators
• Developed a GUI diagnostic tool
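The sketch below shows what "rates of change and short-history indicators" and a 75/25 split might look like for 5 minutes of 100 Hz data. The specific features, the history length and the chronological (rather than shuffled) split are all assumptions, and the signal is a synthetic stand-in. The resulting rows could then be passed to a classifier such as scikit-learn's `RandomForestClassifier`, which is left out here to keep the example dependency-free.

```python
def build_features(signal, history=5):
    """Per-sample feature rows: value, rate of change, short-history stats."""
    rows = []
    for i in range(len(signal)):
        prev = signal[i - 1] if i > 0 else signal[i]
        recent = signal[max(0, i - history + 1): i + 1]
        recent_mean = sum(recent) / len(recent)
        rows.append([
            signal[i],                  # raw value
            signal[i] - prev,           # rate of change per sample
            recent_mean,                # mean of the last `history` samples
            signal[i] - recent_mean,    # deviation from recent history
        ])
    return rows


def chronological_split(rows, labels, train_fraction=0.75):
    """First 75% of the trace for training, last 25% for testing (assumed)."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:], labels[:cut], labels[cut:]


n_samples = 5 * 60 * 100                                   # 5 minutes at 100 Hz
signal = [float(i % 100) for i in range(n_samples)]        # sawtooth stand-in
labels = [1 if i % 100 == 0 else 0 for i in range(n_samples)]  # 1 in 100 positive

X = build_features(signal)
X_train, X_test, y_train, y_test = chronological_split(X, labels)
print(len(X_train), len(X_test), sum(y_train), sum(y_test))
```

With only about 300 positive samples in 30,000, plain accuracy would be misleading: a model that never predicts a breath onset would still score about 99%.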

Correctly identified breaths
• True positives have a consistent shape
• High predict_proba at the correct location
Figure: heavily sedated patient, all work done by the ventilator

The interesting failures
• False negatives (and positives) have some odd shapes; note the double predict_proba indications
Figure: conscious patient with strong breathing effort, ventilator makes some contribution
We see delayed triggering of the ventilator back-up breath in the blue example
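One common way to turn a per-sample probability trace into discrete events, and to collapse a double indication into a single breath, is thresholding with a refractory period. The sketch below is a generic illustration, not the method used in this project. The threshold and the 30 sample (0.3 s) refractory window are assumed values.

```python
def pick_events(proba, threshold=0.5, refractory=30):
    """Return one index per event from a per-sample probability trace.

    Samples at or above `threshold` that fall within `refractory` samples
    of the previous event are merged into it, keeping the more probable one.
    """
    events = []
    for i, p in enumerate(proba):
        if p < threshold:
            continue
        if events and i - events[-1] < refractory:
            if p > proba[events[-1]]:
                events[-1] = i          # same breath, better candidate
            continue
        events.append(i)
    return events


# Made-up probability trace with two double indications.
proba = [0.0] * 200
proba[50], proba[60] = 0.9, 0.7       # second peak is weaker
proba[150], proba[158] = 0.6, 0.8     # second peak is stronger
events = pick_events(proba)
print(events)
```

Each pair of nearby peaks is reported once, at the sample with the higher probability.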

The GUI Tools
• Bokeh in Notebook very useful for annotating timeseries sections
• Matplotlib used for ML diagnosis (last 2 slides)
• Used Notebook widgets (ipywidgets)
• Had to ask for a new feature (Notebook team responsive - thanks!)
• If you wrote these from scratch you’d probably want to put aside several days

Next steps
• The Random Forest model does pretty well and we can explain why it works
• Trained models generalise over time; we haven’t tested how well they generalise over patients
• Next we’ll validate whether we’re “good enough” to start work on auto-PEEP calculations

Your experience?
• Have you tackled timeseries event classification in different ways?
• Do you have experience with recurrent neural networks (or a similar recurrent approach) that might work?
• We’re very open to feedback and ideas!

Acknowledgments:
• Professor Colin Morley for advice
• Draeger Medical for providing the ventilator data download tool
• All the doctors and nurses of Cambridge Neonatal Intensive Care Unit
• ModelInsight and Endava’s financial support

Contact
• IanOzsvald.com - @IanOzsvald
• Gusztav - github.com/Ventilator-Python
• Giles - linkedin.com/in/gilesweaver
Previous talks:
https://www.meetup.com/PyData-London-Meetup/events/236523797/
http://2016.pyconuk.org/talks/python-in-medicine-ventilator-data/
