Slide 1

Slide 1 text

Machine learning with ventilator data to improve reporting on critically ill newborn infants Gusztav Belteki, Giles Weaver, Ian Ozsvald PyDataLondon 2017

Slide 2

Slide 2 text

• Gusztav Belteki - Consultant Neonatologist in Cambridge, interested in neonatal ventilation and data analysis with Python • Ian Ozsvald - Long-time Pythonista, PyDataLondon co-founder, ML consultant, author • Giles Weaver - Bioinformatician turned Python data scientist, Review Committee member • This collaboration builds on our PyData London January talk

Slide 3

Slide 3 text

Why do some newborn babies require mechanical ventilation? • Prematurity: lungs, muscles and brain are too immature to support adequate gas exchange • Full-term babies may require intensive care (e.g. infection, after an operation, birth depression etc.) • We have >1500 “ventilator days” yearly • Ventilation is also an important part of paediatric and adult intensive care

Slide 4

Slide 4 text

Ventilator in 1985 Ventilator in 2015

Slide 5

Slide 5 text

Two respiratory pumps • During spontaneous breathing, ventilation is an ACTIVE process with the patient using negative pressures • Mechanical ventilation during general anaesthesia is a PASSIVE process with the ventilator using positive pressures • During neonatal intensive care, patients do not receive full sedation or muscle relaxation: ventilation is the combination and superimposition of these two pumps

Slide 6

Slide 6 text

“Clean ventilator breath” Patient-ventilator interaction: inspiration during expiratory phase

Slide 7

Slide 7 text

Patient not breathing Patient breathing

Slide 8

Slide 8 text

Data Collection • Performed a service evaluation of ventilation on the neonatal intensive care unit • Downloaded ~160 days of ventilator data from 59 ventilated neonates • Most recordings are >24 hours, usually 2-4 days • Time series data, sampling rate is 100 Hz (every 10 msec) • Data are retrieved as CSV files • Generates approximately 650 Mbyte of data per 24 hours of ventilation (1 ventilator day)
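A quick back-of-the-envelope check of the figures on this slide, as a minimal sketch. It assumes 1 Mbyte = 10^6 bytes and that all ~160 days were recorded at the same rate; the variable names are illustrative and not taken from the project code.

```python
# Figures quoted on the slide
SAMPLING_RATE_HZ = 100            # one sample every 10 msec
MBYTE_PER_VENTILATOR_DAY = 650    # approximate
VENTILATOR_DAYS = 160             # approximate

SECONDS_PER_DAY = 24 * 60 * 60

# Number of time steps recorded in one ventilator day
samples_per_day = SAMPLING_RATE_HZ * SECONDS_PER_DAY

# Average CSV bytes per time step (assumption: 1 Mbyte = 10**6 bytes)
bytes_per_sample = MBYTE_PER_VENTILATOR_DAY * 1_000_000 / samples_per_day

# Rough size of the whole data set in Gbyte
total_gbyte = VENTILATOR_DAYS * MBYTE_PER_VENTILATOR_DAY / 1000

print(samples_per_day, round(bytes_per_sample), total_gbyte)
```

At this size a single ventilator day fits in memory on a laptop, but the full data set probably needs to be processed one recording at a time.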

Slide 9

Slide 9 text

[Figure: flow (L/min) and pressure (mbar) traces for an artificial lung, a sedated patient and a breathing patient; 3 hours of ventilation (~1,000,000 data points)] Mechanical ventilation is always a complex physical process because of the interaction between the ventilator and the patient

Slide 10

Slide 10 text

Aims (clinical) • To provide the clinician with SIMPLE and QUANTITATIVE indicators of ventilator-patient interactions However… …this requires looking at individual breaths in isolation …which would require the ventilator data to be split into individual breaths… …which is not feasible to do manually on a longer trace

Slide 11

Slide 11 text

Aims (data science) • Automatically segment each breath (43,200 per night) – we have a working prototype • Summarisation of breathing statistics only possible if we’ve segmented them • Calculate “auto-PEEP” - a harmful condition for the baby • (future) Begin to classify patient-initiated or ventilator-initiated breaths and other ideas once we have segmentations in place
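The figure of 43,200 breaths per night can be reproduced with simple arithmetic. This is a sketch under two assumptions: roughly one breath per 100 samples (the "1 positive sample per 100 samples" ratio quoted on the Random Forest slide) and a 12-hour night, which the slides do not define.

```python
SAMPLING_RATE_HZ = 100     # from the data collection slide
SAMPLES_PER_BREATH = 100   # approximate, from the Random Forest slide
NIGHT_HOURS = 12           # assumption: "night" is not defined in the slides

# 100 samples per breath at 100 Hz is one breath per second
breaths_per_minute = SAMPLING_RATE_HZ * 60 / SAMPLES_PER_BREATH

breaths_per_night = int(breaths_per_minute * 60 * NIGHT_HOURS)

print(breaths_per_minute, breaths_per_night)
```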

Slide 12

Slide 12 text

Summarising a set of breaths • If we can segment the breaths – what do we want to see? [Figure annotations: ventilator delivers backup inflations if baby does not breathe for some time; breaths triggered by the baby are regularly spaced; different levels of ventilator contribution]

Slide 13

Slide 13 text

Summarising a set of breaths • Having segmented breaths – we can show ‘what happened in a period of time’ - descriptive statistics of ventilator/patient interactions
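As an illustration of what such descriptive statistics might look like once breaths are segmented, here is a minimal sketch. The function name, the input format (breath start times in seconds plus a backup-inflation flag per breath) and the example values are all hypothetical; they are not taken from the project.

```python
from statistics import mean, pstdev


def summarise_breaths(start_times_s, is_backup):
    """Summarise a period of ventilation from segmented breaths.

    start_times_s: breath start times in seconds (hypothetical input format)
    is_backup: one bool per breath, True for a ventilator backup inflation
    """
    intervals = [b - a for a, b in zip(start_times_s, start_times_s[1:])]
    duration_min = (start_times_s[-1] - start_times_s[0]) / 60
    return {
        "n_breaths": len(start_times_s),
        "rate_per_min": len(intervals) / duration_min,
        "mean_interval_s": mean(intervals),
        # Spread of the intervals: low when breaths are regularly spaced
        "interval_sd_s": pstdev(intervals),
        "backup_fraction": sum(is_backup) / len(is_backup),
    }


# Made-up example: regular breaths, one pause ended by a backup inflation
starts = [0.0, 1.0, 2.0, 3.0, 5.0, 6.0]
backup = [False, False, False, False, True, False]
summary = summarise_breaths(starts, backup)
print(summary)
```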

Slide 14

Slide 14 text

Project phases • Get CSVs, check they’re sane • Fix data issues (timestamps #sigh) • Exploratory Data Analysis • Hand-building a Gold Standard for ML • Simple many-moving-averages “classifier” • Use of Random Forests and building up features for improved ML • Review with Dr. Belteki
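The slides do not describe the moving-average rule itself. One plausible reading, sketched below, compares a fast and a slow trailing average of the flow signal and marks a breath start where the fast one crosses above the slow one. The window lengths, function names and the synthetic sine-wave "flow" are assumptions for illustration only.

```python
import math


def moving_average(values, window):
    """Trailing mean over the most recent `window` samples (fewer at the start)."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


def detect_onsets(flow, fast=5, slow=50):
    """Indices where the fast average crosses above the slow average."""
    fast_ma = moving_average(flow, fast)
    slow_ma = moving_average(flow, slow)
    onsets = []
    # Start once the slow window is full to avoid start-up artefacts
    for i in range(slow, len(flow)):
        was_below = fast_ma[i - 1] <= slow_ma[i - 1]
        is_above = fast_ma[i] > slow_ma[i]
        if was_below and is_above:
            onsets.append(i)
    return onsets


# Synthetic stand-in for a flow trace: 1 breath per second at 100 Hz, 5 seconds
flow = [math.sin(2 * math.pi * i / 100) for i in range(500)]
onsets = detect_onsets(flow)
print(onsets)
```

On a clean periodic signal this yields one onset per cycle. Real traces with patient effort are far less regular, which is presumably why the project moved on to a learned classifier.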

Slide 15

Slide 15 text

Our technical approach • Notebooks, git, gitter • Modularised Notebooks (see GitHub), which enables tool reuse • Thanking:

Slide 16

Slide 16 text

Annotating breaths by hand • Giles annotated 20 minutes of 4 traces using a custom Bokeh + ipywidgets tool

Slide 17

Slide 17 text

Random Forests and classifications • Used a 75% train & 25% test split per patient on 5 minutes of data (100Hz), 1 positive sample per 100 samples (approximately) • Built up a set of features that solved the problem reliably for most patients – rates of change and short-history indicators • Developed a GUI diagnostic tool
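A minimal sketch of this setup with scikit-learn, run on synthetic data. The feature set (lagged values and differences), the chronological 75/25 split, the hyperparameters and the sine-plus-noise signal are assumptions; the slides only say that rates of change and short-history indicators were used, and do not state how the split was drawn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 5 minutes at 100 Hz, one "breath start" per 100 samples
n = 5 * 60 * 100
t = np.arange(n)
flow = np.sin(2 * np.pi * t / 100) + 0.05 * rng.standard_normal(n)
labels = (t % 100 == 0).astype(int)


def build_features(signal, lags=(1, 5, 10)):
    """Stack the signal with lagged copies and differences (hypothetical features)."""
    cols = [signal]
    for lag in lags:
        lagged = np.roll(signal, lag)
        lagged[:lag] = signal[0]        # pad the start instead of wrapping around
        cols.append(lagged)             # short-history indicator
        cols.append(signal - lagged)    # rate of change over `lag` samples
    return np.column_stack(cols)


X = build_features(flow)

# Assumption: chronological split, first 75% for training, last 25% for testing
split = int(0.75 * n)
model = RandomForestClassifier(
    n_estimators=20, class_weight="balanced", random_state=0
)
model.fit(X[:split], labels[:split])

# Per-sample probability that a breath starts here
proba = model.predict_proba(X[split:])[:, 1]
print(X.shape, proba.shape, int(labels.sum()))
```

`class_weight="balanced"` is one common way to handle a 1:100 class imbalance; whether the original project used it is not stated.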

Slide 18

Slide 18 text

Correctly identified breaths • True positives have a consistent shape • High predict_proba at the correct location [Figure caption: heavily sedated patient, all work done by the ventilator]
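To turn a per-sample probability trace into discrete breath starts, one option is to keep the highest-probability sample within each cluster of nearby detections. This is a sketch, not the project's method; the function name, threshold and minimum gap are hypothetical.

```python
def pick_breath_starts(proba, threshold=0.5, min_gap=30):
    """Return indices of probability peaks at least `min_gap` samples apart."""
    starts = []
    for i, p in enumerate(proba):
        if p < threshold:
            continue
        if starts and i - starts[-1] < min_gap:
            # Too close to the previous detection: treat as the same breath
            # and keep whichever sample has the higher probability
            if p > proba[starts[-1]]:
                starts[-1] = i
            continue
        starts.append(i)
    return starts


# Made-up probability trace with two clusters of detections
proba = [0.0] * 300
proba[100], proba[101] = 0.7, 0.9
proba[200], proba[215] = 0.8, 0.6
starts = pick_breath_starts(proba)
print(starts)
```

A minimum gap of this kind would also collapse a double indication into a single event, at the cost of hiding genuinely close breaths.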

Slide 19

Slide 19 text

The interesting failures • False negatives (and positives) have some odd shapes; note the double predict_proba indications [Figure caption: conscious patient with strong breathing effort, ventilator makes some contribution. We see delayed triggering of the ventilator back-up breath in the blue example]

Slide 20

Slide 20 text

The GUI Tools • Bokeh in Notebook very useful for annotating timeseries sections • Matplotlib used for ML diagnosis (last 2 slides) • Used Notebook widgets (ipywidgets) • Had to ask for a new feature (Notebook team responsive – thanks!) • If you wrote these from scratch you’d probably want to put aside several days

Slide 21

Slide 21 text

Next steps • The Random Forest model does pretty well and we can explain why it works • Trained models generalise over time; we haven’t tested how well they generalise over patients • Next we’ll validate whether we’re “good enough” to start work on auto-PEEP calculations

Slide 22

Slide 22 text

Your experience? • Have you tackled timeseries event classification in different ways? • Do you have experience with recurrent neural networks (or similar recurrent approach) that might work? • We’re very open to feedback and ideas!

Slide 23

Slide 23 text

Acknowledgments: • Professor Colin Morley for advice • Draeger Medical for providing the ventilator data download tool • All the doctors and nurses of Cambridge Neonatal Intensive Care Unit • ModelInsight and Endava for financial support

Slide 24

Slide 24 text

Contact • IanOzsvald.com - @IanOzsvald • Gusztav – github.com/Ventilator-Python • Giles – linkedin.com/in/gilesweaver Previous talks: https://www.meetup.com/PyData-London-Meetup/events/236523797/ http://2016.pyconuk.org/talks/python-in-medicine-ventilator-data/