Slide 1

Slide 1 text

ML Model Drifting: Spatial & Temporal Analysis Photo by Mike Hindle on Unsplash Andrés L. Martínez Ortiz @davilagrau

Slide 2

Slide 2 text

About me… Andrés-Leonardo Martínez-Ortiz, a.k.a. almo, holds a PhD in Software, Systems and Computing and a Master's in Computer Science. Based in Zurich, almo is a member of the Google Machine Learning Site Reliability Engineering team, leading several programs aimed at reliability, efficiency and convergence. He is also a member of IEEE, ACM, the Linux Foundation and the Computer Society. @davilagrau almo

Slide 3

Slide 3 text

Agenda
Machine Learning Operations:
● Efficiency, reliability and convergence.
● Model Drift
Spatial Drift
Temporal Drift
Risk Modelling
References
Photo by Bradyn Shock on Unsplash

Slide 4

Slide 4 text

Machine Learning Operations Photo by Philipp Katzenberger on Unsplash

Slide 5

Slide 5 text

Machine Learning Operations

Slide 6

Slide 6 text

MLOps: Model Drifting
[Diagram labels: Machine Learning Abstract Model; Data Drifting; Concept Drifting; Spatial Drifting; Temporal Drifting]

Slide 7

Slide 7 text

Spatial Drifting Photo by Pawel Czerwinski on Unsplash

Slide 8

Slide 8 text

Spatial Drift: Challenges and research areas
● Detection in unstructured and noisy datasets.
● Understanding the model drift, which is required for proper treatment.
● Reacting to model drift by adapting the life cycle.
Photo by Harole Ethan on Unsplash

Slide 9

Slide 9 text

Detection General Framework (Source: arXiv:2004.05785)
[Figure labels: Inferring data distribution; Extracting sensitive features; Severity of the drift; Drift detection; Accuracy]
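A minimal sketch of the detection step, assuming a two-window setup: a reference window and a current window of one numeric feature are compared with a two-sample Kolmogorov-Smirnov statistic. This is one illustrative choice of test statistic, not the specific framework of arXiv:2004.05785. The function names, window sizes and the 0.05 significance level are assumptions made for the example. The statistic itself can also be read as a rough severity measure.

```python
import bisect
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    d = 0.0
    # The gap between two step functions can only change at a sample point.
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        d = max(d, abs(cdf_ref - cdf_cur))
    return d


def detect_drift(reference, current, alpha=0.05):
    """Return (drift_flag, statistic, critical_value).

    Uses the large-sample critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)).
    """
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    d = ks_statistic(reference, current)
    return d > critical, d, critical


# Hypothetical data: the "shifted" window has its mean moved by one unit.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
same = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]

print("same distribution:", detect_drift(reference, same))
print("shifted distribution:", detect_drift(reference, shifted))
```

In practice the test would run per feature and per window, so the significance level would need a correction for multiple comparisons.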

Slide 10

Slide 10 text

General drift patterns and algorithms' performance Source: arXiv:2004.05785 Source: Expert Systems with Applications 41 (2014) 8144–8156

Slide 11

Slide 11 text

Drift Understanding
Time series analysis
Synthetic data
Degradation pattern datasets
Explainable analysis (symbolic regression)
For critical applications, detection is not enough. ML drift depends heavily on the application, which makes general solutions difficult. Open datasets and synthetic data are opportunities for new developments.
Massive Online Analysis (MOA)

Slide 12

Slide 12 text

Temporal Drift Photo by Jon Tyson on Unsplash

Slide 13

Slide 13 text

What is temporal ML drift?
Temporal degradation of ML models affecting
● Penalized regression
● Random forest
● Gradient boosting
● Neural networks
over
● long-lived datasets (3-5 years)
● with no data or concept drift
● across multiple domains: weather, finance, hospital planning and flight delays.
Photo by Dustin Humes on Unsplash
Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022).

Slide 14

Slide 14 text

Evaluation framework Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022). You need to define your own evaluation framework.
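One possible starting point for such a framework, as a hedged sketch: train a model once, freeze it, and score it on later time slices to get an error-versus-age curve. This is not the code of Vela et al.; the one-feature linear model, the synthetic series and the relative-error definition (error at a given age divided by error at training time) are all assumptions for illustration. The synthetic slope changes over time only so that the curve has something to show.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (closed form, one feature)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def mse(model, xs, ys):
    """Mean squared error of the frozen model on one time slice."""
    a, b = model
    return statistics.fmean([(a * x + b - y) ** 2 for x, y in zip(xs, ys)])


def make_series(n_steps, per_step, rng):
    """Hypothetical data: one (xs, ys) slice per time step, slope grows slowly."""
    series = []
    for t in range(n_steps):
        slope = 1.0 + 0.05 * t
        xs = [rng.uniform(-1.0, 1.0) for _ in range(per_step)]
        ys = [slope * x + rng.gauss(0.0, 0.1) for x in xs]
        series.append((xs, ys))
    return series


def degradation_curve(series, train_at, max_age):
    """Train once at `train_at`, then return relative MSE for ages 0..max_age."""
    model = fit_line(*series[train_at])
    base = mse(model, *series[train_at])
    return [mse(model, *series[train_at + age]) / base for age in range(max_age + 1)]


rng = random.Random(0)
series = make_series(n_steps=12, per_step=200, rng=rng)
curve = degradation_curve(series, train_at=0, max_age=10)
for age, rel in enumerate(curve):
    print(f"model age {age:2d}: relative MSE {rel:.2f}")
```

Repeating the curve for several training times, model families and training-set sizes would give a distribution of curves per model age rather than a single trajectory.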

Slide 15

Slide 15 text

What does temporal ML drift look like? No degradation or gradual degradation Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022).

Slide 16

Slide 16 text

What does temporal ML drift look like? Explosive degradation Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022).

Slide 17

Slide 17 text

What does temporal ML drift look like? Increasing variability Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022).

Slide 18

Slide 18 text

What does temporal ML drift look like? Exotic patterns: chaotic and periodic Vela, D., Sharp, A., Zhang, R. et al. Temporal quality degradation in AI models. Sci Rep 12, 11654 (2022).

Slide 19

Slide 19 text

Implications for ML Operations
Long-lasting models demand temporal degradation analysis. Numerical analysis and dynamic systems analysis are required.
Automatic re-training is not (always) an option:
● Lack of clear thresholds
● Lack of training data
● Lack of convergence
● Catastrophic forgetting
● Random seed dependencies
Recommendations
Extend drift analysis to include temporal drift.
Evaluation should cover models, hyperparameters and the size of the training data.
Feature drift analysis
Real-time or high-frequency monitoring
Your models now live longer than ever. Temporal drift analysis is a must.
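A minimal sketch of high-frequency monitoring, assuming prediction errors arrive one at a time and a baseline window of errors is available from validation. The class name, window size and both ratio thresholds are hypothetical. As noted above, clear thresholds are often missing, so these values would have to be tuned per application. Tracking the variance as well as the mean is meant to catch the "increasing variability" pattern.

```python
from collections import deque
import statistics


class RollingErrorMonitor:
    """Compares a rolling window of recent errors with a fixed baseline."""

    def __init__(self, baseline_errors, window=50, mean_ratio=1.5, var_ratio=2.0):
        self.base_mean = statistics.fmean(baseline_errors)
        self.base_var = statistics.pvariance(baseline_errors)
        self.window = deque(maxlen=window)
        # Hypothetical alert thresholds, expressed as multiples of the baseline.
        self.mean_ratio = mean_ratio
        self.var_ratio = var_ratio

    def update(self, error):
        """Add one absolute error; return the list of triggered alert names."""
        self.window.append(error)
        if len(self.window) < self.window.maxlen:
            return []  # window not full yet, no decision
        alerts = []
        if statistics.fmean(self.window) > self.mean_ratio * self.base_mean:
            alerts.append("mean_error")
        if statistics.pvariance(self.window) > self.var_ratio * self.base_var:
            alerts.append("error_variance")
        return alerts


# Hypothetical stream: healthy errors first, then a sudden jump.
baseline = [1.0, 1.2, 0.8, 1.1, 0.9] * 10
monitor = RollingErrorMonitor(baseline, window=5)
stream = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.0, 3.0, 3.0, 3.0]
for step, err in enumerate(stream):
    print(step, err, monitor.update(err))
```

An alert here would trigger analysis rather than automatic re-training, in line with the caveats listed on the slide.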

Slide 20

Slide 20 text

Risk Modelling and Mitigation Photo by Edge2Edge Media on Unsplash

Slide 21

Slide 21 text

Risks
Vulnerabilities in complex systems (hardware, software, data)
Deceptive human interaction: misleading reporting
Unpredictable sequential planning (collusion)
Difficulties being shut down
Photo by Tom Morel on Unsplash
Deployment of ML systems requires risk analysis, including technological, business and legal perspectives.

Slide 22

Slide 22 text

Model Evaluation along the life cycle
[Diagram labels: Internal: multi-layer APIs: red teaming; Auditing; External]
Evaluation requires proper development, documentation and deployment, adding extra complexity.

Slide 23

Slide 23 text

Model Evaluation Source: arXiv:2305.15324

Slide 24

Slide 24 text

Limitations and hazards
Limitations
Complex system integrations: unpredictable interactions
Unknown unknowns
Hidden features
Over-promising
Evaluation technology
Hazards
Impact of model evaluation
Superficial improvements to model safety

Slide 25

Slide 25 text

Organizational Implications
Communications
● Incident analysis and reporting, including external parties.
● Auditability
● Scientific peer review
● Internal communication for business units and non-technical staff.
Security
● Intensive evaluation strategies
● AI-based monitoring
● Fast response protocols
● Integrity verification, authorization and auditing
Photo by Alexander Grey on Unsplash

Slide 26

Slide 26 text

Thank you! Questions?

Slide 27

Slide 27 text

References
● Lu J., Liu A., Dong F., Gu F., Gama J. and Zhang G. Learning under Concept Drift: A Review, arXiv:2004.05785 (2020).
● Zenisek J., Holzinger F. and Affenzeller M. Machine learning based concept drift detection for predictive maintenance, Computers & Industrial Engineering 137 (2019) 106031.
● Gonçalves P. M., Carvalho Santos S.G.T., Barros R.S.M. and Vieira D.C.L. A comparative study on concept drift detectors, Expert Systems with Applications 41 (2014) 8144–8156.
● Vela D., Sharp A., Zhang R. et al. Temporal quality degradation in AI models, Sci Rep 12, 11654 (2022).
● Shevlane T., Farquhar S., Garfinkel B., Phuong M., Whittlestone J., Leung J., Kokotajlo D., Marchal N., Anderljung M., Kolt N., Ho L., Siddarth D., Avin S., Hawkins W., Kim B., Gabriel I., Bolina V., Clark J., Bengio Y., Christiano P. and Dafoe A. Model evaluation for extreme risks, arXiv:2305.15324 (2023).