Prediction of metabolic and pre-metabolic syndromes using machine learning models with anthropometric, lifestyle, and biochemical factors from a middle-aged population in Korea.

Affiliation

KM Data Division, Korea Institute of Oriental Medicine, 1672 Yuseongdae-ro, Yuseong-gu, Daejeon, Republic of Korea.

Publication information

BMC Public Health. 2022 Apr 6;22(1):664. doi: 10.1186/s12889-022-13131-x.

Abstract

BACKGROUND

Metabolic syndrome (MetS) is a complex condition that appears as a cluster of metabolic abnormalities, and is closely associated with the prevalence of various diseases. Early prediction of the risk of MetS in the middle-aged population provides greater benefits for cardiovascular disease-related health outcomes. This study aimed to apply the latest machine learning techniques to find the optimal MetS prediction model for the middle-aged Korean population.

METHODS

We retrieved 20 data types from the Korean Medicine Daejeon Citizen Cohort, a cohort study on a community-based population of adults aged 30-55 years. The data included sex, age, anthropometric data, lifestyle-related data, and blood indicators of 1991 individuals. Participants satisfying two (pre-MetS) or ≥ 3 (MetS) of the five NCEP-ATP III criteria were included in the MetS group. MetS prediction used nine machine learning models based on the following algorithms: decision tree, Gaussian Naïve Bayes, K-nearest neighbor, eXtreme gradient boosting (XGBoost), random forest, logistic regression, support vector machine, multi-layer perceptron, and 1D convolutional neural network. All analyses were performed by sequentially inputting the features in three steps according to their characteristics. The models' performances were compared after applying the synthetic minority oversampling technique (SMOTE) to resolve data imbalance.
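
The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the general idea: each synthetic minority-class row is placed at a random point on the line segment between a sampled minority row and one of its k nearest minority neighbours. The function name `smote_sketch`, its parameters, and the toy values are hypothetical and are not taken from the study.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority-class rows by interpolating between a
    sampled row and one of its k nearest minority-class neighbours.

    Illustrative only: not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy rows: [body mass index, waist-to-hip ratio].
minority_rows = [[27.1, 0.93], [29.4, 0.96], [31.0, 0.98], [28.2, 0.95]]
new_rows = smote_sketch(minority_rows, n_synthetic=6, k=2)
print(len(new_rows))
```

In practice, oversampling of this kind is usually applied only to the training split, so that evaluation data contain no synthetic rows; the abstract does not state how the authors handled this.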

RESULTS

MetS was detected in 33.85% of the subjects. Among the MetS prediction models, the tree-based random forest and XGBoost models showed the best performance, which improved with the number of features used. As a measure of the models' performance, the area under the receiver operating characteristic curve (AUC) increased by up to 0.091 when SMOTE was applied, with XGBoost showing the highest AUC of 0.851. Body mass index and waist-to-hip ratio were identified as the most important features in the MetS prediction models for this population.
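
For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = MetS, 0 = no MetS.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; library implementations typically use a sort-based computation instead.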

CONCLUSIONS

Tree-based machine learning models were useful in identifying MetS with high accuracy in middle-aged Koreans. Early diagnosis of MetS is important and requires a multidimensional approach that includes self-administered questionnaires, anthropometric measurements, and biochemical measurements.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f063/8985311/677b26d19ad2/12889_2022_13131_Fig1_HTML.jpg
