Department of Rheumatology and Clinical Immunology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China.
School of Mathematical Sciences, Soochow University, Suzhou, China.
Math Biosci Eng. 2023 Jan;20(3):4574-4591. doi: 10.3934/mbe.2023212. Epub 2022 Dec 27.
Growing evidence shows that gout patients are at increased risk of cardiovascular diseases, especially coronary heart disease (CHD). Screening for CHD in gout patients based on simple clinical factors remains challenging. Here we aim to build a machine-learning-based diagnostic model that avoids missed diagnoses and excessive examinations as far as possible. Over 300 patient samples collected from Jiangxi Provincial People's Hospital were divided into two groups (gout and gout+CHD), so that the prediction of CHD in gout patients is modeled as a binary classification problem. A total of eight clinical indicators were selected as features for the machine learning classifiers. A combined sampling technique was used to address class imbalance in the training dataset. Eight machine learning models were used: logistic regression, decision tree, ensemble learning models (random forest, XGBoost, LightGBM, GBDT), support vector machine (SVM) and neural networks. Our results showed that stepwise logistic regression and SVM achieved higher AUC values, while the random forest and XGBoost models performed better in terms of recall and accuracy. Furthermore, several high-risk factors were found to be effective indicators for predicting CHD in gout patients, which provides insight into clinical diagnosis.
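The abstract does not say which combined sampling method was used or how the metrics were computed. The following is a minimal sketch, assuming plain random oversampling of the minority class together with random undersampling of the majority class as a stand-in, and standard definitions of accuracy, recall and AUC. The function names (`combined_resample`, `accuracy`, `recall`, `auc`) and the midpoint target size are hypothetical choices for illustration, not the authors' implementation.

```python
import random


def combined_resample(rows, labels, seed=0):
    """Balance a binary dataset by oversampling the minority class and
    undersampling the majority class to the midpoint of the two class sizes.
    Assumption: random resampling stands in for the unspecified method."""
    rng = random.Random(seed)
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    if len(pos) <= len(neg):
        minority, majority, minority_label = pos, neg, 1
    else:
        minority, majority, minority_label = neg, pos, 0

    target = (len(pos) + len(neg)) // 2
    # Oversample: keep every minority row, then draw extras with replacement.
    over = minority + [rng.choice(minority) for _ in range(target - len(minority))]
    # Undersample: draw majority rows without replacement.
    under = rng.sample(majority, target)

    pairs = [(r, minority_label) for r in over]
    pairs += [(r, 1 - minority_label) for r in under]
    rng.shuffle(pairs)
    return [r for r, _ in pairs], [y for _, y in pairs]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (gout+CHD) that the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only: 3 positives and 9 negatives, not the study's data.
    rows = list(range(12))
    labels = [1] * 3 + [0] * 9
    _, balanced_labels = combined_resample(rows, labels)
    print(sum(balanced_labels), len(balanced_labels))
```

As in the abstract, resampling of this kind would be applied to the training set only, so that the evaluation metrics are computed on data with the original class distribution. Recall matters here because a false negative corresponds to a missed CHD diagnosis.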