Identification of Potential Type II Diabetes in a Large-Scale Chinese Population Using a Systematic Machine Learning Framework.

Affiliations

Hospital of Traditional Chinese Medicine Affiliated to the Fourth Clinical Medical College of Xinjiang Medical University, Urumqi, China.

College of Public Health, Xinjiang Medical University, Urumqi, China.

Publication information

J Diabetes Res. 2020 Sep 24;2020:6873891. doi: 10.1155/2020/6873891. eCollection 2020.

DOI:10.1155/2020/6873891
PMID:33029536
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7532405/
Abstract

BACKGROUND

An estimated 425 million people worldwide have diabetes, which accounts for 12% of the world's health expenditures. The number continues to grow, placing a huge burden on healthcare systems, especially in remote, underserved areas.

METHODS

A total of 584,168 adult subjects who had participated in the national physical examination were enrolled in this study. Risk factors for type II diabetes mellitus (T2DM) were identified by P values and odds ratios, using logistic regression (LR) on variables from physical measurements and a questionnaire. Using the risk factors selected by LR, we applied a decision tree, a random forest, AdaBoost with a decision tree (AdaBoost), and an extreme gradient boosting decision tree (XGBoost) to identify individuals with T2DM, compared the performance of the four machine learning classifiers, and used the best-performing classifier to output variable importance scores for T2DM.
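To make the odds-ratio criterion concrete, the following is a minimal sketch for a single binary risk factor. The counts are hypothetical and not taken from the study, and the function name `odds_ratio_with_ci` is an illustrative assumption. For one binary predictor, the unadjusted odds ratio from a 2×2 table equals exp(β) from a univariate logistic regression; a multivariable LR such as the one described here would yield adjusted values instead.

```python
import math


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 gives ~95%)."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: T2DM vs. no T2DM among people with and without hypertension
or_value, ci_low, ci_high = odds_ratio_with_ci(300, 700, 150, 850)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

A confidence interval that excludes 1 corresponds to a statistically significant association at the chosen level, which is the kind of evidence a P-value threshold would also capture.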

RESULTS

XGBoost had the best performance (accuracy = 0.906, precision = 0.910, recall = 0.902, F1 = 0.906, and AUC = 0.968). The variable importance scores from XGBoost showed that BMI was the most important feature, followed by age, waist circumference, systolic pressure, ethnicity, smoking amount, fatty liver, hypertension, physical activity, drinking status, dietary ratio (meat to vegetables), drinking amount, smoking status, and dietary habit (preference for oily food).
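As a quick consistency check on these figures, the sketch below shows how accuracy, precision, recall, and the F1 score follow from a binary confusion matrix, and recomputes F1 from the reported precision and recall. The function names and the confusion-matrix counts are illustrative assumptions, not values from the study. AUC is not included because it depends on the ranking of predicted scores and cannot be derived from a single confusion matrix.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from the precision and recall reported in the abstract
print(round(f1_score(0.910, 0.902), 3))  # 0.906

# Hypothetical confusion matrix, for illustration only
print(classification_metrics(tp=90, fp=10, fn=10, tn=90))
```

The recomputed value agrees with the reported F1 of 0.906.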

CONCLUSIONS

We proposed an LR-XGBoost classifier that uses fourteen easily obtained, noninvasive patient variables as predictors to identify potential cases of T2DM. The classifier can accurately screen for diabetes risk at an early phase, and the variable importance scores offer clues for preventing the onset of diabetes.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c46f/7532405/bca4117a46ad/JDR2020-6873891.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c46f/7532405/8ba742d2fc95/JDR2020-6873891.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c46f/7532405/f418f6cd8906/JDR2020-6873891.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c46f/7532405/070d588c9008/JDR2020-6873891.004.jpg
