Predicting the onset of overweight in Chinese high school students: a machine-learning approach in a one-year prospective cohort study.

Authors

Zhang Zikang, Peng Wei, Sun Shaoming, Ma Jianguo, Sun Yining, Zhang Fangwen

Affiliations

Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, PR China.

University of Science and Technology of China, Hefei, 230026, PR China.

Publication information

Endocrine. 2024 Nov;86(2):600-611. doi: 10.1007/s12020-024-03902-4. Epub 2024 Jun 10.

Abstract

OBJECTIVE

This study aimed to develop and evaluate machine-learning models for predicting the onset of overweight in adolescents aged 14‒17, utilizing easily collectible personal information.

METHODS

This was a one-year prospective cohort study. Baseline data were collected through anthropometric measurements and questionnaires, and the incidence of overweight was determined one year later via anthropometric measurements. Predictive factors were selected through univariate analysis. Six machine-learning models were developed to predict the onset of overweight. SHapley Additive exPlanations (SHAP) was used for global and local interpretation of the models.
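To make the interpretation step concrete, here is a minimal sketch of the Shapley-value idea that SHAP builds on, computed exactly for a toy model. It is not the study's model or the `shap` library: the scoring function, the feature names (`bmi`, `screen_hours`, `sleep_hours`), and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    """Hypothetical risk score with one interaction term (assumption)."""
    bmi, screen_hours, sleep_hours = z
    return (0.05 * bmi + 0.02 * screen_hours - 0.03 * sleep_hours
            + 0.01 * bmi * screen_hours)


x = [23.0, 4.0, 6.0]         # one hypothetical student (local explanation)
baseline = [20.0, 2.0, 8.0]  # hypothetical reference values
phi = shapley_values(toy_score, x, baseline)
print(dict(zip(["bmi", "screen_hours", "sleep_hours"], phi)))
```

The per-feature contributions sum to the difference between the score for `x` and the score for the baseline, which is what makes the attribution a local explanation. Averaging absolute contributions over many instances gives a global ranking of the kind the abstract refers to.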

RESULTS

Of 1,241 adolescents, 204 (16.4%) were identified as overweight after one year. Nineteen features were associated with overweight incidence in univariate analysis. Participants were randomly divided into training and testing groups in a 7:3 ratio. The Light Gradient Boosting Machine (LGBM) algorithm outperformed the other models, achieving an accuracy of 0.956, recall of 0.812, specificity of 0.983, F1-score of 0.855, and AUC of 0.961. Feature-importance ranking showed that a minimal set of the top 11 features maintained stable model performance.
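For readers less familiar with the reported metrics, the following sketch shows how accuracy, recall, specificity, and F1-score follow from a confusion matrix, and how AUC can be computed from predicted scores. The confusion-matrix counts and the scores are hypothetical; the abstract does not report them, and they are not meant to reproduce the study's figures.

```python
def binary_metrics(tp, fp, tn, fn):
    """Threshold-based metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn)          # sensitivity: overweight cases found
    specificity = tn / (tn + fp)     # non-cases correctly ruled out
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as half). Quadratic in sample size."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts for a test set of 200 students (assumption).
m = binary_metrics(tp=40, fp=5, tn=145, fn=10)
print(m)

# Hypothetical predicted probabilities and true labels (assumption).
print(auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

With an outcome prevalence near 16%, accuracy alone can look high even when many cases are missed, which is why recall and F1-score are worth reading alongside it.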

CONCLUSIONS

The onset of overweight in adolescents was accurately predicted using easily collectible personal information. The LGBM-based model exhibited superior performance. The oversampling technique notably improved model performance. The model interpretation technique provided innovative strategies for managing adolescent overweight/obesity.
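The abstract does not name the oversampling technique, say whether the split was stratified, or state where oversampling was applied. As a rough sketch under those caveats, the code below does a random 7:3 split and then applies simple random oversampling to the training portion only, a common way to keep duplicated cases out of the test set. The function names are hypothetical, and the rows are synthetic placeholders that reuse only the class counts given in the abstract (204 of 1,241).

```python
import random


def split_70_30(rows, seed=0):
    """Shuffle and split rows into 70% training and 30% testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def random_oversample(rows, label_of, seed=0):
    """Duplicate minority-class rows at random until classes are balanced.

    Stand-in for whatever oversampling method the study used (assumption).
    """
    rng = random.Random(seed)
    pos = [r for r in rows if label_of(r) == 1]
    neg = [r for r in rows if label_of(r) == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority)
             for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Synthetic (id, label) rows: 204 overweight (1) and 1,037 not (0).
rows = [(i, 1) for i in range(204)] + [(i, 0) for i in range(204, 1241)]

train, test = split_70_30(rows)
balanced_train = random_oversample(train, label_of=lambda r: r[1])

print(len(train), len(test), len(balanced_train))
```

Oversampling after the split, rather than before it, keeps copies of the same student from appearing in both groups, which would otherwise inflate the test metrics.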
