Pharmaceutical Sciences and Health Technologies 3.
Bakos, Péter
Semmelweis University, Centre for Translational Medicine
Péter Bakos1, Bence Szabó1, Dávid Laczkó1, Caner Turan1, Shir Galin1, Péter Hegyi1, László Zubek1, András Lovas1, Zsolt Molnár1
1: Semmelweis University, Centre for Translational Medicine
Background
Optimal timing of extubation in mechanically ventilated patients remains a major challenge in intensive care. Machine learning (ML) models have been increasingly proposed to support clinical decision-making, yet their predictive performance and readiness for clinical application in extubation outcomes remain uncertain.
Aims
This study aimed to evaluate the performance and clinical readiness of ML models for predicting extubation success.
Methods
A systematic search was performed in PubMed, Embase and CENTRAL up to November 5, 2024. Studies including mechanically ventilated critically ill adult patients undergoing planned extubation were eligible. The index test was any ML model predicting extubation outcome. Models reporting an area under the receiver operating characteristic curve (AUC) were meta-analyzed, and subgroup analyses were conducted. Risk of bias was assessed using a modified version of the Quality Assessment of Diagnostic Accuracy Studies-Comparative (QUADAS-C) tool, adapted for ML-based prediction studies.
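The pooling step can be illustrated with a minimal sketch. The abstract does not state which estimator was used, so everything here is an assumption: the sketch uses a DerSimonian-Laird random-effects model on logit-transformed AUCs, recovers each model's standard error from its reported 95% confidence interval, and the function name `pool_auc` and the example values are hypothetical rather than taken from the review.

```python
import math


def logit(p):
    """Map a proportion in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a real value back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_auc(aucs, ci_lows, ci_highs, z=1.96):
    """Pool AUCs with a DerSimonian-Laird random-effects model (assumed method).

    Each AUC is logit-transformed; its standard error is approximated
    from the width of the reported 95% CI on the logit scale.
    """
    y = [logit(a) for a in aucs]
    v = [((logit(h) - logit(l)) / (2.0 * z)) ** 2
         for l, h in zip(ci_lows, ci_highs)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights and pooled estimate on the logit scale.
    w_re = [1.0 / (vi + tau2) for vi in v]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sw_re
    se_mu = math.sqrt(1.0 / sw_re)

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled_auc": inv_logit(mu),
        "ci_low": inv_logit(mu - z * se_mu),
        "ci_high": inv_logit(mu + z * se_mu),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical inputs for illustration only; not data from the included studies.
example = pool_auc(
    aucs=[0.72, 0.85, 0.93],
    ci_lows=[0.65, 0.80, 0.89],
    ci_highs=[0.78, 0.89, 0.96],
)
print(example)
```

Pooling on the logit scale keeps the back-transformed confidence interval inside (0, 1) and makes it asymmetric around the point estimate, which is one common reason to prefer it over pooling raw AUCs.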
Results
Twenty-six studies were included in the systematic review, and 47 ML models from 14 studies (n = 34,322 patients) were eligible for meta-analysis.
Reported AUC values across predictive models ranged from 0.59 to 0.98. Pooled AUCs by model type were 0.88 (95% CI: 0.78–0.94) for classical ML models and 0.85 (95% CI: 0.68–0.94) for deep learning models. When only the best-performing ML model from each study was considered, the pooled AUC was 0.90 (95% CI: 0.82–0.95). Pooled estimates were derived from internally validated models, as only two studies reported externally validated AUCs with confidence intervals.
Study heterogeneity was high, driven by substantial differences in predictor selection and model design.
Conclusion
Machine learning models demonstrate acceptable discriminatory performance for predicting extubation success. However, limited external and prospective validation, substantial heterogeneity, and inconsistent reporting currently preclude their routine clinical implementation.
Funding
This study was conducted within the framework of the Centre for Translational Medicine at Semmelweis University and was supported by the 2024-2.1.2-EKÖP-KDP-2024-00002 University Research Scholarship Program of the Ministry for Culture and Innovation from the source of the Hungarian National Research, Development and Innovation Fund.