Sullivan Brynne A, Moreira Alvaro G, McAdams Ryan M, Knake Lindsey A, Husain Ameena, Qiu Jiaxing, Mudireddy Avinash, Majeedi Abrar, Shalish Wissam, Lake Douglas E, Vesoulis Zachary A
University of Virginia, Department of Pediatrics, Division of Neonatology, Charlottesville, VA, USA.
University of Texas Health San Antonio, Department of Pediatrics, Division of Neonatology, San Antonio, TX, USA.
Pediatr Res. 2024 Dec 16. doi: 10.1038/s41390-024-03773-5.
Predicting mortality risk in neonatal intensive care units (NICUs) is challenging due to complex, variable clinical and physiological data. Machine learning (ML) offers potential for more accurate risk stratification.
To compare the performance of various ML models in predicting NICU mortality using a team-based modeling competition.
We conducted a modeling competition in which five neonatologist-led teams applied ML techniques (logistic regression, CatBoost, neural networks, random forest, and XGBoost) to a shared dataset from over 6,000 NICU admissions. The dataset included static demographic and clinical variables, alongside daily samples of heart rate and oxygen saturation. Each team developed models to predict mortality risk at baseline and within 7 days. Models were evaluated using the area under the receiver operating characteristic curve (AUC). Results were presented at a national meeting, where an audience poll ranked the models before the AUC results were revealed.
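For readers unfamiliar with the evaluation metric, AUC can be read as the probability that a model assigns a higher risk score to a randomly chosen infant who died than to a randomly chosen survivor, with ties counted as one half. The following is a minimal sketch of that pairwise definition, not the code used by any team in the study; the function name `roc_auc` and the four labels and scores are made up for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUC: the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half a pair.

    Hypothetical helper for illustration only; labels are 1 (event) or 0.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie contributes half
    return wins / (len(pos) * len(neg))


# Made-up example: two survivors (0) and two deaths (1) with predicted risks.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(example_labels, example_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking. The quadratic loop is kept for clarity; a rank-based computation would be preferable on a dataset of several thousand admissions.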
The audience favored the most complex model, a convolutional neural network (CNN), for real-world application, although logistic regression achieved the highest AUC on test data. Teams employed varied feature selection, tuning, and evaluation strategies.
Logistic regression outperformed more complex models, highlighting the importance of selecting modeling methods based on data characteristics, interpretability, and expertise rather than model complexity alone.
By demonstrating that model complexity does not necessarily equate to better predictive performance, this research encourages the careful selection of modeling approaches.