Comparison of machine-learning algorithms to build a predictive model for detecting undiagnosed diabetes - ELSA-Brasil: accuracy study.

Author information

Olivera André Rodrigues, Roesler Valter, Iochpe Cirano, Schmidt Maria Inês, Vigo Álvaro, Barreto Sandhi Maria, Duncan Bruce Bartholow

Affiliations

MSc. IT Analyst, Postgraduate Computing Program, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre (RS), Brazil.

PhD. Professor, Postgraduate Computing Program, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre (RS), Brazil.

Publication information

Sao Paulo Med J. 2017 May-Jun;135(3):234-246. doi: 10.1590/1516-3180.2016.0309010217.

DOI:10.1590/1516-3180.2016.0309010217

PMID:28746659

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10019841/

Abstract

CONTEXT AND OBJECTIVE

Type 2 diabetes is a chronic disease associated with a wide range of serious health complications that have a major impact on overall health. The aims here were to develop and validate predictive models for detecting undiagnosed diabetes using data from the Longitudinal Study of Adult Health (ELSA-Brasil) and to compare the performance of different machine-learning algorithms in this task.

DESIGN AND SETTING

Comparison of machine-learning algorithms to develop predictive models using data from ELSA-Brasil.

METHODS

After selecting a subset of 27 candidate variables from the literature, models were built and validated in four sequential steps: (i) parameter tuning with tenfold cross-validation, repeated three times; (ii) automatic variable selection using forward selection, a wrapper strategy with four different machine-learning algorithms and tenfold cross-validation (repeated three times), to evaluate each subset of variables; (iii) error estimation of model parameters with tenfold cross-validation, repeated ten times; and (iv) generalization testing on an independent dataset. The models were created with the following machine-learning algorithms: logistic regression, artificial neural network, naïve Bayes, K-nearest neighbor and random forest.
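
Steps (i) to (iii) all rest on repeated tenfold cross-validation, and step (ii) wraps it in a greedy forward search over the candidate variables. The following is a minimal Python sketch of that pattern, not the authors' code. The function names repeated_kfold and forward_selection, the stopping threshold min_gain, and the toy scoring function are assumptions introduced for illustration. In a real analysis, score_subset would train one of the listed algorithms on each training fold and return the mean AUC over the test folds.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repetition
        folds = [idx[i::k] for i in range(k)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train_idx, test_idx


def forward_selection(candidates, score_subset, min_gain=1e-4):
    """Greedy wrapper search: keep adding the variable that raises the score most."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        top_score, top_var = max((score_subset(selected + [v]), v) for v in remaining)
        if top_score - best < min_gain:  # no candidate helps enough: stop
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best


# Hypothetical stand-in for "mean cross-validated AUC of a model on this subset".
TOY_GAIN = {"age": 0.10, "bmi": 0.05, "shoe_size": 0.0}


def toy_score(subset):
    return 0.5 + sum(TOY_GAIN[v] for v in subset)


splits = list(repeated_kfold(n_samples=100, k=10, repeats=3))
chosen, score = forward_selection(TOY_GAIN, toy_score)
print(len(splits), chosen, round(score, 2))
```

With 100 samples, tenfold cross-validation repeated three times yields 30 train/test splits. The toy search keeps the two informative variables and stops once adding the uninformative one no longer improves the score.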

RESULTS

The best models were created using artificial neural networks and logistic regression. These achieved mean areas under the curve of, respectively, 75.24% and 74.98% in the error estimation step and 74.17% and 74.41% in the generalization testing step.
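
The percentages above are areas under the curve (AUC), which for a classifier of this kind normally means the ROC curve. One way to read an AUC of about 75% is as the probability that a randomly chosen participant with undiagnosed diabetes receives a higher predicted risk than a randomly chosen participant without it. A minimal sketch of that pairwise definition follows. The function name roc_auc and the four-person example data are made up for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 = undiagnosed diabetes, 0 = no diabetes.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

This quadratic pairwise form is chosen for readability. For large datasets, a rank-based computation gives the same value more efficiently.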

CONCLUSION

Most of the predictive models produced similar results, and demonstrated the feasibility of identifying individuals with highest probability of having undiagnosed diabetes, through easily-obtained clinical data.
