Department of Applied Mathematics, Kyung Hee University, Yongin, South Korea.
Department of Mathematics Education, Chonnam National University, Gwangju, South Korea.
Front Public Health. 2023 Jan 17;10:998782. doi: 10.3389/fpubh.2022.998782. eCollection 2022.
Machine learning is a powerful tool for discovering hidden information and relationships in data-driven research fields. Obesity is an extremely complex condition involving biological, physiological, psychological, and environmental factors. Machine learning frameworks are one successful approach to the problem because they can reveal complex and essential risk factors of obesity. Over the last two decades, the obese population (BMI above 23) in Korea has grown. The purpose of this study is to identify risk factors that predict obesity using machine learning classifiers and to determine which of the classifiers used for obesity prediction achieves the best accuracy. This work will allow people to assess obesity risk from blood test and blood pressure data, based on the Korea National Health and Nutrition Examination Survey (KNHANES), whose data are constructed from an annual survey. Our data include a total of 21,100 participants (10,000 male and 11,100 female). We assess obesity prediction using six machine learning algorithms and explore age- and gender-specific risk factors of obesity for adults (19-79 years old). Our results highlight the four most significant features for predicting obesity across all age-gender groups: triglycerides, ALT (SGPT), glycated hemoglobin, and uric acid. Our findings show that the risk factors for obesity are sensitive to age and gender under different machine learning algorithms. Performance is highest for the 19-39 age group of both genders, with accuracy and AUC above 70%, while the 60-79 age group shows accuracy and AUC of around 65%. For the 40-59 age group, the proposed algorithm achieved an AUC above 70%, but its accuracy for female participants was below 70%. For all classifiers and age groups, accuracy changes little when the number of features exceeds six; however, accuracy decreased in the female 19-39 age group.
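The per-group evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it only shows how accuracy and AUC might be computed separately for each age-gender group, given obesity labels and predicted scores from any classifier. The age bands come from the abstract; the function names, the record layout `(age, sex, label, score)`, and the 0.5 decision threshold are assumptions made for illustration.

```python
from collections import defaultdict

# Age bands named in the abstract; everything else here is hypothetical.
AGE_BANDS = [(19, 39), (40, 59), (60, 79)]


def age_gender_group(age, sex):
    """Return a label such as 'female 19-39', or None if age is outside 19-79."""
    for lo, hi in AGE_BANDS:
        if lo <= age <= hi:
            return f"{sex} {lo}-{hi}"
    return None


def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of cases where the thresholded score matches the 0/1 label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return hits / len(y_true)


def auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half. Quadratic time, which is fine
    for a small illustration."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_by_group(records):
    """records: iterable of (age, sex, obese_label, predicted_score) tuples.
    Returns {group_label: {'accuracy': ..., 'auc': ...}}."""
    grouped = defaultdict(lambda: ([], []))
    for age, sex, label, score in records:
        group = age_gender_group(age, sex)
        if group is not None:  # skip participants outside 19-79
            grouped[group][0].append(label)
            grouped[group][1].append(score)
    return {
        group: {"accuracy": accuracy(labels, scores), "auc": auc(labels, scores)}
        for group, (labels, scores) in grouped.items()
    }
```

Calling `evaluate_by_group` on the held-out predictions of each classifier would give one accuracy/AUC pair per age-gender group, which is the kind of comparison the abstract reports. Any records passed in for experimentation would be toy data, not KNHANES values.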