The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models.

Affiliations

Department of Neurology, The Second Affiliated Hospital of Xi'an Jiaotong University, No. 157 West Five Road, Xi'an, 710004, Shaanxi, China.

Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China.

Publication information

BMC Med Inform Decis Mak. 2021 Apr 5;21(1):115. doi: 10.1186/s12911-021-01480-3.

Abstract

BACKGROUND

Screening carotid B-mode ultrasonography is frequently used to detect carotid atherosclerosis (CAS). Because CAS progresses asymptomatically in most patients, early identification is challenging for clinicians, and the disease may trigger ischemic stroke. Recently, machine learning has shown a strong ability to classify data and potential for prediction in the medical field. Combining machine learning with patients' electronic health records could provide clinicians with a more convenient and precise method to identify asymptomatic CAS.

METHODS

This retrospective cohort study used routine clinical data of medical check-up subjects from April 19, 2010 to November 15, 2019. Six machine learning models (logistic regression [LR], random forest [RF], decision tree [DT], eXtreme Gradient Boosting [XGB], Gaussian Naïve Bayes [GNB], and K-Nearest Neighbour [KNN]) were used to predict asymptomatic CAS, and their predictive performance was compared in terms of the area under the receiver operating characteristic curve (AUCROC), accuracy (ACC), and F1 score (F1).
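The three comparison metrics named above can be sketched in plain Python. This is a minimal illustration, not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for the example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """ACC: fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUCROC: probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUCROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = asymptomatic CAS, 0 = no CAS.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
# Assumed threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

toy_acc = accuracy(y_true, y_pred)
toy_f1 = f1_score(y_true, y_pred)
toy_auc = auc_roc(y_true, scores)
```

AUCROC is computed from the raw scores, so it does not depend on the threshold, whereas ACC and F1 change with the cut-off chosen. The pairwise loop is quadratic in the sample size; for a cohort of this size a rank-based computation or a library routine would be the practical choice.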

RESULTS

Of the 18,441 subjects, 6553 were diagnosed with asymptomatic CAS. Compared to DT (AUCROC 0.628, ACC 65.4%, and F1 52.5%), the other five models achieved a higher AUCROC: KNN +7.6% (0.704, 68.8%, and 50.9%, respectively), GNB +12.5% (0.753, 67.0%, and 46.8%, respectively), XGB +16.0% (0.788, 73.4%, and 55.7%, respectively), RF +16.6% (0.794, 74.5%, and 56.8%, respectively), and LR +18.1% (0.809, 74.7%, and 59.9%, respectively). The best-performing model, LR, correctly predicted 1045/1966 cases (sensitivity 53.2%) and 3088/3566 non-cases (specificity 86.6%). A tenfold cross-validation scheme further verified the predictive ability of LR.
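The figures reported for LR can be cross-checked from the stated counts alone. The sketch below treats the 1966 cases and 3566 non-cases as a single evaluation set, which is an assumption (the abstract does not say how that subset was drawn), and reads the "+x%" values as AUCROC differences from DT multiplied by 100, which is an interpretation rather than something the abstract states. All variable names are illustrative.

```python
# Counts reported for LR in the abstract.
tp, cases = 1045, 1966        # correctly predicted cases / all cases
tn, non_cases = 3088, 3566    # correctly predicted non-cases / all non-cases
fn = cases - tp               # cases missed by the model
fp = non_cases - tn           # non-cases flagged as cases

sensitivity = tp / cases
specificity = tn / non_cases
acc = (tp + tn) / (cases + non_cases)
f1 = 2 * tp / (2 * tp + fp + fn)

# AUCROC values as reported; gain is taken relative to DT, in points of AUCROC x 100.
auc = {"DT": 0.628, "KNN": 0.704, "GNB": 0.753, "XGB": 0.788, "RF": 0.794, "LR": 0.809}
gain = {name: round((value - auc["DT"]) * 100, 1)
        for name, value in auc.items() if name != "DT"}

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}")
print(f"ACC {acc:.1%}, F1 {f1:.1%}")
print(gain)
```

Under these assumptions the recomputed sensitivity, specificity, ACC, and F1 round to the values given for LR (53.2%, 86.6%, 74.7%, and 59.9%), and the AUCROC differences match the reported gains.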

CONCLUSIONS

Among the six machine learning models, LR showed the best performance in predicting asymptomatic CAS. Our findings set the stage for an early automated warning system, allowing a more precise allocation of CAS prevention measures to the individuals most likely to benefit.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c676/8020544/de604c947652/12911_2021_1480_Fig1_HTML.jpg
