Fudan University School of Public Health, Building 8, 130 Dong'an Road, Shanghai 200032, China; Key Laboratory of Public Health Safety, Fudan University, Ministry of Education, Building 8, 130 Dong'an Road, Shanghai 200032, China; Fudan University Center for Tropical Disease Research, Building 8, 130 Dong'an Road, Shanghai 200032, China.
Hunan Institute for Schistosomiasis Control, Yueyang, Hunan Province, China.
Int J Parasitol. 2021 Oct;51(11):959-965. doi: 10.1016/j.ijpara.2021.03.004. Epub 2021 Apr 21.
Short-term prognosis of advanced schistosomiasis has not been well studied. We aimed to construct prognostic models using machine learning algorithms and to identify the most important predictors using routinely available data from the government medical assistance programme. An established database of advanced schistosomiasis in Hunan, China was used for the analysis. A total of 9541 patients from the period January 2008 to December 2018 were enrolled in this study. Candidate predictors were selected from demographics, clinical features, medical examinations and test results. We applied five machine learning algorithms to construct 1-year prognostic models: logistic regression (LR), decision tree (DT), random forest (RF), artificial neural network (ANN) and extreme gradient boosting (XGBoost). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. The important predictors of an unfavourable prognosis within 1 year were identified from the optimal model and ranked. A total of 1249 patients (13.1%) had an unfavourable prognosis within 1 year of discharge. The mean age of all participants was 61.94 years, and 70.9% were male. Overall, XGBoost showed the best predictive performance, with the highest AUC (0.846; 95% confidence interval (CI): 0.821, 0.871), compared with LR (0.798; 95% CI: 0.770, 0.827), DT (0.766; 95% CI: 0.733, 0.800), RF (0.823; 95% CI: 0.796, 0.851), and ANN (0.806; 95% CI: 0.778, 0.835). The five most important predictors identified by XGBoost were ascitic fluid volume, haemoglobin (HB), total bilirubin (TB), albumin (ALB), and platelets (PT). We propose XGBoost as the best algorithm for evaluating the 1-year prognosis of advanced schistosomiasis. It can be considered a simple and useful tool for the short-term prediction of an unfavourable prognosis for advanced schistosomiasis in clinical settings.
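The comparison described above ranks models by AUC and reports a 95% CI for each. The following is a minimal sketch of that kind of evaluation, not the study's code. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption. The function names (`auc`, `bootstrap_ci`), the model labels and the toy data are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample containing one class has no defined AUC
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data only: 1 = unfavourable prognosis within 1 year, 0 = otherwise.
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]
# Hypothetical predicted risks from two hypothetical models.
predictions = {
    "model_a": [0.10, 0.20, 0.15, 0.70, 0.30, 0.80, 0.25, 0.60, 0.90, 0.40],
    "model_b": [0.30, 0.10, 0.50, 0.40, 0.20, 0.90, 0.60, 0.35, 0.70, 0.45],
}

# Rank the models by point-estimate AUC, highest first.
for name in sorted(predictions, key=lambda m: auc(labels, predictions[m]), reverse=True):
    point = auc(labels, predictions[name])
    lower, upper = bootstrap_ci(labels, predictions[name])
    print(f"{name}: AUC {point:.3f} (95% CI: {lower:.3f}, {upper:.3f})")
```

The pairwise AUC is quadratic in the number of cases. For a cohort of several thousand patients, a rank-based formula or a library routine would be the more practical choice.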