Qian Rui, Zhuang Jiamei, Xie Jianjun, Cheng Honghui, Ou Haiya, Lu Xiang, Ouyang Zichen
Department of Gastroenterology, Shenzhen Bao'an Chinese Medicine Hospital, Guangzhou University of Chinese Medicine, Shenzhen 518000, China.
The Fourth Clinical Medical College of Guangzhou University of Chinese Medicine, Shenzhen, 518033, China.
Heliyon. 2024 Apr 15;10(8):e29603. doi: 10.1016/j.heliyon.2024.e29603. eCollection 2024 Apr 30.
Early prediction of the severity of acute pancreatitis (AP) remains a challenge in clinical practice. Well-established clinical scoring tools exist, but their actual predictive performance remains uncertain. Various studies have explored machine learning methods for early prediction of AP severity, yet a more comprehensive evidence-based assessment is needed to determine their predictive accuracy. This systematic review and meta-analysis therefore aimed to evaluate the accuracy of machine learning in predicting the severity of AP.
PubMed, EMBASE, Cochrane Library, and Web of Science were systematically searched up to December 5, 2023. The risk of bias in eligible studies was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST). Subgroup analyses were performed by type of machine learning model. Additionally, the predictive accuracy of mainstream scoring tools was summarized.
This systematic review ultimately included 33 original studies. The pooled c-index was 0.87 (95% CI: 0.84-0.89) in the training sets and 0.88 (95% CI: 0.86-0.90) in the validation sets. Sensitivity was 0.81 (95% CI: 0.77-0.84) in the training sets and 0.79 (95% CI: 0.71-0.85) in the validation sets. Specificity was 0.84 (95% CI: 0.78-0.89) in the training sets and 0.90 (95% CI: 0.86-0.93) in the validation sets. Logistic regression was the most frequently used model; however, its predictive accuracy was inferior to that of neural networks, random forests, and XGBoost. The pooled c-indices of the APACHE II, BISAP, and Ranson scores were 0.74 (95% CI: 0.68-0.80), 0.77 (95% CI: 0.70-0.85), and 0.74 (95% CI: 0.68-0.79), respectively.
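As a reading aid, the pooled sensitivity and specificity reported above can be turned into likelihood ratios and a diagnostic odds ratio with the standard formulas LR+ = sensitivity / (1 - specificity), LR- = (1 - sensitivity) / specificity, and DOR = LR+ / LR-. The sketch below is illustrative only: the function name `likelihood_ratios` and the dictionary layout are assumptions, the review itself does not report these derived quantities, and values computed from pooled point estimates only approximate what a formal bivariate meta-analysis would give.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for one sensitivity/specificity pair.

    Hypothetical helper for illustration; not taken from the review.
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Pooled point estimates as reported in the abstract (sensitivity, specificity).
pooled = {
    "training": (0.81, 0.84),
    "validation": (0.79, 0.90),
}

results = {name: likelihood_ratios(*pair) for name, pair in pooled.items()}

for name, (lr_pos, lr_neg, dor) in results.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

Under these assumptions, a positive machine learning prediction in the validation sets corresponds to an LR+ of roughly 7.9. Confidence intervals are deliberately left out because they cannot be recovered from the point estimates alone.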
Machine learning demonstrates excellent accuracy in predicting the severity of AP, providing a reference for updating or developing a straightforward clinical prediction tool.