Identification of influence factors in overweight population through an interpretable risk model based on machine learning: a large retrospective cohort.

Affiliations

Department of Endocrinology, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital, Fuzhou, 350001, PR China.

Department of Critical Care Medicine, Shengli Clinical Medical College of Fujian Medical University, Fujian Provincial Hospital South Branch, Fujian Provincial Hospital Jinshan Branch, Fujian Provincial Hospital, Fuzhou, 350001, PR China.

Publication information

Endocrine. 2024 Mar;83(3):604-614. doi: 10.1007/s12020-023-03536-y. Epub 2023 Sep 30.

DOI: 10.1007/s12020-023-03536-y

PMID: 37776483

Abstract

BACKGROUND

Identifying the risk factors associated with overweight is crucial for future health risk prediction and behavioral intervention. Several methodological questions in machine learning, such as how to apply cross-validation, still lack consensus, and the resulting models may suffer from overfitting or poor interpretability.

METHODS

This study employed nine commonly used machine learning methods to construct overweight risk models. The target population was the general community: a total of 10,905 Chinese subjects from Ningde City, Fujian Province, southeast China, participated. The best model was selected through appropriate verification and validation and was then interpreted.
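The abstract does not specify the validation procedure. As a minimal sketch of the general idea of choosing among candidate models by k-fold cross-validation, the following uses only the Python standard library. The data are synthetic, and the two toy "models" (`fit_majority`, `fit_stump`) and all other names are illustrative assumptions, not the nine methods or the cohort used in the study.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_majority(X, y):
    """Baseline: always predict the most common training label (ties -> 0)."""
    label = 1 if sum(y) * 2 > len(y) else 0
    return lambda x: label

def fit_stump(X, y):
    """One-feature threshold rule chosen to minimise training error."""
    def errors(t):
        return sum((x >= t) != bool(label) for x, label in zip(X, y))
    best_t = min(sorted(set(X)), key=errors)
    return lambda x: int(x >= best_t)

def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds; each fold is held out once."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return mean(scores)

# Synthetic toy data (NOT study data): one feature, label is 1 when x >= 85.
X = list(range(60, 110))
y = [int(x >= 85) for x in X]

# Hypothetical candidates standing in for the real learning algorithms.
candidates = {"majority": fit_majority, "threshold_stump": fit_stump}
scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

Because every model is scored only on subjects it did not see during fitting, a model that merely memorizes the training data gains no advantage, which is why cross-validation is a common guard against overfitting during model selection.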

RESULTS

The machine learning overweight risk models performed well. CatBoost, applied here to the construction of clinical risk models, may outperform earlier machine learning methods. The visual display of the Shapley additive explanation (SHAP) values of the model variables accurately represented the influence of each variable in the model.
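To illustrate what a Shapley additive explanation value measures, here is a minimal sketch that computes exact Shapley values for a tiny model by enumerating every feature coalition. It is not the study's pipeline: `toy_model`, the all-ones instance, and the all-zeros baseline are hypothetical, and "absent" features are simply set to a baseline value, which is a simplification of the expectation used in SHAP. Exact enumeration is only feasible for a handful of features, so practical tools rely on faster approximations or model-specific algorithms.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for one instance x.

    A coalition's value is the model output when features in the coalition
    take their values from x and all other features take baseline values.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(f):
    """Hypothetical score: two linear terms plus an x0*x2 interaction."""
    x0, x1, x2 = f
    return 2 * x0 + 3 * x1 + x0 * x2

x = [1, 1, 1]         # instance to explain (illustrative)
baseline = [0, 0, 0]  # reference point (illustrative)
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [2.5, 3.0, 0.5]
```

The contributions sum to the difference between the prediction for the instance and the prediction at the baseline (here 6 - 0), and the interaction term is split equally between the two features involved. This additivity is what makes per-variable plots of such values readable as each variable's influence on the model output.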

CONCLUSIONS

Machine learning may currently be the best approach for constructing an overweight risk model, and CatBoost may be the best-performing machine learning method. Furthermore, combining Shapley additive explanations with machine learning methods can be effective in identifying disease risk factors for prevention and control.
