Choe Eun Kyung, Rhee Hwanseok, Lee Seungjae, Shin Eunsoon, Oh Seung-Won, Lee Jong-Eun, Choi Seung Ho
Department of Surgery, Seoul National University Hospital, Healthcare System Gangnam Center, Seoul 06236, Korea.
DNALink, Inc., Seoul 03759, Korea.
Genomics Inform. 2018 Dec;16(4):e31. doi: 10.5808/GI.2018.16.4.e31. Epub 2018 Dec 28.
Metabolic syndrome (MS) is not uncommon in the nonobese population, yet identifying it and mitigating its risk are difficult in this group. We aimed to develop an MS prediction model for nonobese Koreans from genetic and clinical factors using machine learning methods. The model was built from clinical and genetic polymorphism information with five machine learning algorithms, including naïve Bayes classification (NB), and the analysis was performed in two stages (training and test sets). Model A used only clinical information (age, sex, body mass index, smoking status, alcohol consumption status, and exercise status); model B added genetic information (10 polymorphisms) to model A. Of the 7,502 nonobese participants, 647 (8.6%) had MS. In the test set analysis, under the maximum sensitivity criterion, NB showed the highest sensitivity: 0.38 for model A and 0.42 for model B. The specificity of NB was 0.79 for model A and 0.80 for model B. With NB, model B (clinical and genetic information; area under the receiver operating characteristic curve [AUC] = 0.69) performed better than model A (clinical information only; AUC = 0.65). In summary, we designed a prediction model for MS in a nonobese population using clinical and genetic information. With this model, we might convince nonobese individuals with MS to undergo health checks and adopt behaviors associated with a preventive lifestyle.
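For readers less familiar with the metrics reported in the abstract, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed from binary labels and classifier scores. It is not the authors' code. The function names, the toy labels and scores, and the 0.35 decision threshold are all hypothetical and chosen only for illustration. The only figures taken from the abstract are the participant counts (647 of 7,502).

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, 1 = MS, 0 = no MS."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Prevalence reported in the abstract: 647 of 7,502 nonobese participants.
prevalence_pct = round(100 * 647 / 7502, 1)

# Hypothetical toy data (NOT from the study): true labels and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.2, 0.8, 0.3, 0.1, 0.1, 0.05]

# Hypothetical decision threshold; lowering it trades specificity for sensitivity.
threshold = 0.35
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
toy_auc = auc(y_true, scores)

print(f"prevalence = {prevalence_pct}%")
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, AUC = {toy_auc:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figure.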