Department of Epidemiology and Health Statistics, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
Hubei Provincial Center for Disease Control and Prevention, Wuhan, Hubei, China.
PLoS Negl Trop Dis. 2018 Feb 15;12(2):e0006262. doi: 10.1371/journal.pntd.0006262. eCollection 2018 Feb.
BACKGROUND: To better assist medical professionals, this study aimed to develop and compare the performance of three models, namely a multivariate logistic regression (LR) model, an artificial neural network (ANN) model, and a decision tree (DT) model, for predicting the prognosis of patients with advanced schistosomiasis residing in Hubei Province.
METHODOLOGY/PRINCIPAL FINDINGS: Schistosomiasis surveillance data were obtained from a previous study of a Hubei population sample that included 4136 advanced schistosomiasis cases. Predictive models were built using the LR, ANN, and DT methods. From each of the three groups, 70% of the cases (2896 cases in total) were used as training data for the predictive models. The remaining 30% of the cases (1240 cases) formed the validation group used to compare the performance of the three models. Prediction performance was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. Univariate analysis indicated that 16 risk factors were significantly associated with patient prognosis. In the training group, the mean AUC was 0.8276 for LR, 0.9267 for ANN, and 0.8229 for DT. In the validation group, the mean AUC was 0.8349 for LR, 0.8318 for ANN, and 0.8148 for DT. The three models yielded similar results in terms of accuracy, sensitivity, and specificity.
CONCLUSIONS/SIGNIFICANCE: Predictive models for advanced schistosomiasis prognosis built with the LR, ANN, and DT methods all proved effective on our dataset. The ANN model outperformed the LR and DT models in terms of AUC in the training group, while the validation-group AUCs of the three models were close (0.8349, 0.8318, and 0.8148 for LR, ANN, and DT, respectively).
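The evaluation scheme described in the abstract (a 70/30 split drawn within groups, followed by AUC, sensitivity, specificity, and accuracy on the held-out cases) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the abstract does not state which software, random seed, or classification threshold was used, so the function names (`stratified_split`, `auc`, `confusion_metrics`), the 0.5 threshold, and the toy labels and scores below are all illustrative assumptions and are unrelated to the study data.

```python
import random
from collections import defaultdict


def stratified_split(groups, train_fraction=0.7, seed=0):
    """Split case indices so each group contributes about train_fraction to training.

    `groups` holds one group label per case; the grouping is hypothetical here.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    train, valid = [], []
    for group in sorted(by_group):
        indices = by_group[group][:]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy at an assumed score threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


if __name__ == "__main__":
    # Toy outcome labels (1 = poor prognosis) and model scores, invented for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
    print(auc(y_true, scores))  # 0.9375
    print(confusion_metrics(y_true, scores))
    train_idx, valid_idx = stratified_split([0] * 10 + [1] * 10)
    print(len(train_idx), len(valid_idx))  # 14 6
```

Computing the same metrics separately on the training and validation indices is what makes a gap such as the ANN's 0.9267 versus 0.8318 visible; the pairwise AUC above is quadratic in the number of cases, so a rank-based formula or a library routine would be preferable for thousands of cases.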