Interpretable machine learning analysis to identify risk factors for diabetes using the anonymous living census data of Japan.

Author information

Jiang Pei, Suzuki Hiroyuki, Obi Takashi

Affiliations

Course of Information and Communication, Department of Engineer, Tokyo Institute of Technology, Kanagawa, Japan.

Present Address: 4259 Nagatsutachou, Midori Ward, Yokohama, Kanagawa, 226-0026 Japan.

Publication information

Health Technol (Berl). 2023;13(1):119-131. doi: 10.1007/s12553-023-00730-w. Epub 2023 Jan 26.

DOI:10.1007/s12553-023-00730-w
PMID:36718178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9876749/
Abstract

PURPOSE

Diabetes mellitus causes a wide range of problems in daily life. Even amid today's big-data boom, some risk factors for diabetes likely remain unidentified. To identify new risk factors for diabetes and to explore more efficient uses of big data, non-objective-oriented census data from the Japanese Citizen's Survey of Living Conditions were analyzed using interpretable machine learning methods.

METHODS

Seven interpretable machine learning methods were used to analyze census data on Japanese citizens. First, logistic analysis was used to analyze risk factors for diabetes among 19 initially selected elements. Then linear analysis, linear discriminant analysis, Hayashi's quantification method II, random forest, XGBoost, and SHAP were used to re-check the results and to compare the contribution of each factor. Finally, the relationships among the factors were analyzed.
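The first step described above can be illustrated with a minimal sketch, assuming that "logistic analysis" means ordinary logistic regression. The data here are synthetic, and the feature names (`hypertension`, `low_health_awareness`, `noise`), the effect sizes, and the helper functions are hypothetical choices made for illustration; they are not the census variables or the authors' code.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=1.0, epochs=600):
    """Fit logistic regression by batch gradient descent.

    Returns (bias, weights). A plain-Python stand-in for a statistics package.
    """
    n, d = len(X), len(X[0])
    b, w = 0.0, [0.0] * d
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * d
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return b, w

def make_synthetic(n=800, seed=0):
    """Generate hypothetical binary survey answers and a diabetes label."""
    rng = random.Random(seed)
    names = ["hypertension", "low_health_awareness", "noise"]
    true_w = [1.2, 0.8, 0.0]  # assumed log-odds effects; "noise" has none
    X, y = [], []
    for _ in range(n):
        xi = [
            float(rng.random() < 0.3),
            float(rng.random() < 0.4),
            float(rng.random() < 0.5),
        ]
        p = sigmoid(-2.0 + sum(w * x for w, x in zip(true_w, xi)))
        X.append(xi)
        y.append(1 if rng.random() < p else 0)
    return names, X, y

names, X, y = make_synthetic()
bias, weights = fit_logistic(X, y)

# For a binary feature, exp(weight) is the odds ratio of the outcome.
odds_ratios = {name: math.exp(w) for name, w in zip(names, weights)}
for name, ratio in sorted(odds_ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name}: odds ratio ~ {ratio:.2f}")
```

An odds ratio well above 1 marks a candidate risk factor, while a value near 1 suggests no association. The tree-based models and SHAP named above serve as a cross-check on such a ranking and would normally come from dedicated libraries rather than hand-written code.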

RESULTS

Four factors were identified as risk factors for diabetes mellitus for the first time: number of family members, insurance type, public pension type, and health awareness level. Another 11 risk factors were reconfirmed in this analysis. Notably, in some interpretable models, insurance type and health awareness level contributed more to diabetes than hypertension, hyperlipidemia, and stress. Work years was also identified as a risk factor for diabetes, because it has a high correlation coefficient with age, which is itself a risk factor.
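The last finding, that a factor can appear risky only because it tracks age, can be sketched as follows. This is an illustration on synthetic data under stated assumptions: the diabetes probability is made to depend on age alone, and `work_years` is generated from age plus noise. The variable names and numbers are hypothetical and do not come from the study.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(42)
ages, work_years, diabetes = [], [], []
for _ in range(2000):
    age = rng.uniform(25, 75)
    # Assumption: people start working around age 22-30, so work years track age.
    wy = max(0.0, age - 22 - rng.uniform(0, 8))
    # Assumption: risk rises with age only; work years has no direct effect.
    p = 0.02 + 0.004 * (age - 25)
    ages.append(age)
    work_years.append(wy)
    diabetes.append(1.0 if rng.random() < p else 0.0)

r_age_work = pearson(ages, work_years)
r_work_diabetes = pearson(work_years, diabetes)
print(f"corr(age, work_years)      = {r_age_work:.3f}")
print(f"corr(work_years, diabetes) = {r_work_diabetes:.3f}")
```

In this setup work years correlates with diabetes even though it has no direct effect, which is why checking the relationships among factors matters before interpreting any single one.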

CONCLUSIONS

New risk factors for diabetes mellitus were identified from Japan's non-objective-oriented anonymous census data using interpretable machine learning models. The newly identified risk factors suggest possible new policies for preventing diabetes. Moreover, our analysis shows that big data can yield useful knowledge in today's society. Our study also paves the way for identifying more risk factors and for using big data more efficiently.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5f2/9876749/af01f1ca453d/12553_2023_730_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5f2/9876749/88e7597d883b/12553_2023_730_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d5f2/9876749/c5acd690a8eb/12553_2023_730_Fig2_HTML.jpg