Prediction of type 2 diabetes mellitus onset using logistic regression-based scorecards.

Author information

Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel.

Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.

Publication information

Elife. 2022 Jun 22;11:e71862. doi: 10.7554/eLife.71862.

DOI:10.7554/eLife.71862

PMID:35731045

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9255967/

Abstract

BACKGROUND

Type 2 diabetes (T2D) accounts for ~90% of all cases of diabetes, resulting in an estimated 6.7 million deaths in 2021, according to the International Diabetes Federation. Early detection of patients at high risk of developing T2D can reduce the incidence of the disease through changes in lifestyle, diet, or medication. Since populations of lower socio-demographic status are more susceptible to T2D and might have limited resources or limited access to sophisticated computational tools, there is a need for accurate yet accessible prediction models.

METHODS

In this study, we analyzed data from 44,709 nondiabetic UK Biobank participants aged 40-69, predicting the risk of T2D onset within a selected time frame (mean of 7.3 years with an SD of 2.3 years). We started with 798 features that we identified as potential predictors for T2D onset. We first analyzed the data using gradient boosting decision trees, survival analysis, and logistic regression methods. We devised one nonlaboratory model accessible to the general population and one more precise yet simple model that utilizes laboratory tests. We simplified both models to an accessible scorecard form, tested the models on normoglycemic and prediabetes subcohorts, and compared the results to the results of the general cohort. We established the nonlaboratory model using the following covariates: sex, age, weight, height, waist size, hip circumference, waist-to-hip ratio, and body mass index. For the laboratory model, we used age and sex together with four common blood tests: high-density lipoprotein (HDL), gamma-glutamyl transferase, glycated hemoglobin, and triglycerides. As an external validation dataset, we used the electronic medical record database of Clalit Health Services.
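To illustrate what "simplifying a logistic regression model to a scorecard form" can look like, here is a minimal sketch. It assumes each covariate is binned, each bin carries a log-odds contribution, and contributions are scaled and rounded to integer points. The bin edges, coefficients, intercept, and scaling factor below are hypothetical placeholders chosen for illustration; they are not the values fitted in the study.

```python
import math

# HYPOTHETICAL bins: (lower bound inclusive, upper bound exclusive, log-odds contribution).
# These numbers are illustrative only and are NOT taken from the study.
HYPOTHETICAL_BINS = {
    "age": [(40, 50, 0.0), (50, 60, 0.4), (60, 70, 0.7)],
    "hba1c_pct": [(0.0, 5.4, 0.0), (5.4, 5.7, 0.9), (5.7, 6.5, 2.1)],
}
HYPOTHETICAL_INTERCEPT = -5.0  # baseline log-odds (assumed)
POINTS_PER_LOG_ODDS = 10       # scaling factor used to get integer points (assumed)


def bin_log_odds(feature, value):
    """Return the log-odds contribution of the bin that contains `value`."""
    for low, high, beta in HYPOTHETICAL_BINS[feature]:
        if low <= value < high:
            return beta
    raise ValueError(f"{feature}={value} falls outside the defined bins")


def total_points(person):
    """Sum the rounded integer points over all covariates of one person."""
    return sum(
        round(bin_log_odds(feature, value) * POINTS_PER_LOG_ODDS)
        for feature, value in person.items()
    )


def risk_from_points(points):
    """Map a point total back to a probability with the logistic function."""
    log_odds = HYPOTHETICAL_INTERCEPT + points / POINTS_PER_LOG_ODDS
    return 1.0 / (1.0 + math.exp(-log_odds))


# Usage: score one hypothetical participant.
person = {"age": 55, "hba1c_pct": 5.5}
points = total_points(person)
print(points, risk_from_points(points))
```

The appeal of this form is that the point table can be printed on paper and added up by hand, which matches the accessibility goal stated in the background; the trade-off is some loss of precision from binning and rounding.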

RESULTS

The nonlaboratory scorecard model achieved an area under the receiver operating characteristic curve (auROC) of 0.81 (95% confidence interval [CI] 0.77-0.84) and an odds ratio (OR) between the upper and fifth prevalence deciles of 17.2 (95% CI 5-66). Using this model, we classified participants into three risk groups, with 1% (0.8-1%), 5% (3-6%), and 9% (7-12%) risk of developing T2D. We further analyzed the contribution of laboratory tests and devised a blood test scorecard model based on age, sex, and the four common blood tests noted above: glycated hemoglobin (HbA1c%), gamma-glutamyl transferase, triglycerides, and HDL cholesterol. Using this model, we achieved an auROC of 0.87 (95% CI 0.85-0.90) and a decile OR of 48 (95% CI 12-109). We classified the cohort into four risk groups with the following risks of developing T2D: 0.5% (0.4-7%); 3% (2-4%); 10% (8-12%); and a high-risk group of 23% (10-37%). When applying the blood test model to the external validation cohort (Clalit), we achieved an auROC of 0.75 (95% CI 0.74-0.75). We analyzed several additional, more comprehensive models, which included genotyping data and other environmental factors. We found that these models did not provide cost-efficient benefits over the four blood test model. The commonly used German Diabetes Risk Score (GDRS) and Finnish Diabetes Risk Score (FINDRISC) models, trained using our data, achieved an auROC of 0.73 (0.69-0.76) and 0.66 (0.62-0.70), respectively, inferior to the results achieved by the four blood test model and by the anthropometry models.
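The two headline metrics above, auROC and the odds ratio between risk deciles, can be computed from predicted risks and observed outcomes roughly as follows. This is a sketch under stated assumptions rather than the study's evaluation code: auROC is computed as the fraction of case/non-case pairs ranked correctly (ties count as half), deciles are formed by sorting participants by predicted risk, and a 0.5 continuity correction is added to avoid division by zero. The function names and the continuity correction are choices made for this illustration.

```python
def auroc(scores, labels):
    """Probability that a random case is scored higher than a random non-case."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    # Count correctly ordered pairs; a tie contributes half a pair.
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


def decile_odds_ratio(scores, labels, reference_decile=5):
    """Odds of the outcome in the top risk decile divided by the odds in a reference decile."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)

    def odds(decile):
        # Deciles are numbered 1 (lowest predicted risk) to 10 (highest).
        members = order[(decile - 1) * n // 10 : decile * n // 10]
        n_cases = sum(labels[i] for i in members)
        n_controls = len(members) - n_cases
        # Assumed 0.5 continuity correction so empty cells do not divide by zero.
        return (n_cases + 0.5) / (n_controls + 0.5)

    return odds(10) / odds(reference_decile)


# Usage on a tiny toy example (not study data).
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auroc(toy_scores, toy_labels))  # 0.75
```

The pairwise auROC loop is quadratic in cohort size, which is fine for illustration; for tens of thousands of participants a rank-based computation or a library routine would be the practical choice.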

CONCLUSIONS

The four blood test and anthropometric models outperformed the commonly used nonlaboratory models, the FINDRISC and the GDRS. We suggest that our models be used as tools for decision-makers to assess populations at elevated T2D risk and thus improve medical strategies. These models might also provide a personal catalyst for lifestyle, diet, or medication changes that lower the risk of T2D onset.

FUNDING

The funders had no role in study design, data collection, interpretation, or the decision to submit the work for publication.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a29/9255967/609058264571/elife-71862-fig1.jpg
