Lee Minhyuk, Park Taesung, Shin Ji-Yeon, Park Mira
Department of Statistics, Korea University, Seoul, Republic of Korea.
Department of Statistics, Seoul National University, Seoul, Republic of Korea.
Sci Rep. 2024 Aug 1;14(1):17851. doi: 10.1038/s41598-024-68541-1.
Metabolic syndrome (MetS) is a complex disorder characterized by a cluster of metabolic abnormalities: abdominal obesity, hypertension, elevated triglycerides, reduced high-density lipoprotein cholesterol, and impaired glucose tolerance. It is a significant public health concern because individuals with MetS are at increased risk of developing cardiovascular disease and type 2 diabetes, so early and accurate identification of individuals at risk is essential. Various machine learning approaches have been used to predict MetS, including logistic regression, support vector machines, and several boosting techniques. However, these methods treat MetS as a single binary status and do not take into account that it comprises five components. A method that exploits this structure is therefore needed. In this study, we propose a multi-task deep learning model designed to predict MetS and its five components simultaneously. Multi-task learning handles multiple tasks with a single model, and learning related tasks jointly may improve the model's predictive performance. To assess the proposed method, we compared its performance with that of several single-task approaches: logistic regression, support vector machine, CatBoost, LightGBM, XGBoost, and a one-dimensional convolutional neural network. To build the multi-task deep learning model, we used data from the Korean Association Resource (KARE) project, which include 352,228 single nucleotide polymorphisms (SNPs) from 7,729 individuals. In addition to genomic data, we considered lifestyle, dietary, and socio-economic factors that affect chronic diseases. Evaluated on metrics such as accuracy, precision, F1-score, and the area under the receiver operating characteristic curve, our multi-task learning model outperforms traditional single-task machine learning models in predicting MetS.
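To make the multi-task idea concrete, the following is a minimal sketch of one shared representation feeding six binary outputs (MetS plus its five components), trained against a joint loss. It is not the architecture used in the study: the task names, layer sizes, the unweighted averaging of per-task losses, and the synthetic inputs are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical task names (ours, not the paper's): MetS plus its five components.
TASKS = ["mets", "abdominal_obesity", "hypertension",
         "high_triglycerides", "low_hdl", "impaired_glucose"]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_features, n_hidden, rng):
    """One shared hidden layer plus one output column (head) per task."""
    return {
        "W_shared": rng.normal(0.0, 0.1, size=(n_features, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_heads": rng.normal(0.0, 0.1, size=(n_hidden, len(TASKS))),
        "b_heads": np.zeros(len(TASKS)),
    }


def forward(params, X):
    """Return predicted probabilities with shape (n_samples, n_tasks)."""
    # Shared representation used by every task (ReLU activation).
    hidden = np.maximum(0.0, X @ params["W_shared"] + params["b_shared"])
    # Task-specific heads: one sigmoid output per task.
    return sigmoid(hidden @ params["W_heads"] + params["b_heads"])


def multitask_loss(probs, Y, eps=1e-12):
    """Unweighted mean of per-task binary cross-entropies (an assumed choice)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    bce = -(Y * np.log(probs) + (1.0 - Y) * np.log(1.0 - probs))
    per_task = bce.mean(axis=0)          # one loss value per task
    return per_task.mean(), per_task


# Synthetic stand-in data: 8 individuals, 20 features, random binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 20))
Y = rng.integers(0, 2, size=(8, len(TASKS))).astype(float)

params = init_params(X.shape[1], 16, rng)
probs = forward(params, X)
total, per_task = multitask_loss(probs, Y)
```

Because every head reads the same hidden layer, gradients from the component tasks would also shape the representation used for the MetS head, which is the mechanism by which related tasks may help each other. A single-task baseline corresponds to keeping only one column of `W_heads`.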