Yáñez-Sepúlveda Rodrigo, Olivares Rodrigo, Olivares Pablo, Zavala-Crichton Juan Pablo, Hinojosa-Torres Claudio, Giakoni-Ramírez Frano, Souza-Lima Josivaldo de, Monsalves-Álvarez Matías, Tuesta Marcelo, Páez-Herrera Jacqueline, Olivares-Arancibia Jorge, Reyes-Amigo Tomás, Cortés-Roco Guillermo, Hurtado-Almonacid Juan, Guzmán-Muñoz Eduardo, Aguilera-Martínez Nicole, López-Gil José Francisco, Clemente-Suárez Vicente Javier
Faculty of Education and Social Sciences, Universidad Andres Bello, Viña del Mar 2520000, Chile.
Escuela de Ingeniería Informática, Universidad de Valparaíso, Valparaíso 2362905, Chile.
Sports (Basel). 2025 Aug 18;13(8):273. doi: 10.3390/sports13080273.
Cardiometabolic risk in adolescents represents a growing public health concern that is closely linked to modifiable factors such as physical fitness. Traditional statistical approaches often fail to capture complex, nonlinear relationships among anthropometric and fitness-related variables.
This study aimed to develop and evaluate supervised machine learning algorithms, including artificial neural networks and ensemble methods, for classifying cardiometabolic risk levels among Chilean adolescents based on standardized physical fitness assessments.
A cross-sectional analysis was conducted using a large representative sample of school-aged adolescents. Field-based physical fitness tests covering cardiorespiratory fitness (estimated maximal oxygen consumption [VO₂max]), muscular strength (push-ups), and explosive power (horizontal jump) were used as input variables. A cardiometabolic risk index was derived using international criteria. Various supervised machine learning models were trained and compared in terms of accuracy, F1 score, recall, and area under the receiver operating characteristic curve (AUC-ROC).
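The four comparison metrics named above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels and scores are synthetic toy values, the 0.5 decision threshold is an assumption, and the function names (`confusion_counts`, `accuracy`, `recall`, `f1_score`, `auc_roc`) are hypothetical helpers written for this illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels where 1 = at risk."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of all cases classified correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of truly at-risk cases that were flagged (sensitivity)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall."""
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    rec = recall(y_true, y_pred)
    return 2 * precision * rec / (precision + rec) if (precision + rec) else 0.0


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as half), i.e. the Mann-Whitney formulation."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (NOT from the study): 1 = at risk, 0 = not at risk.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1, 0.7, 0.3, 0.5]  # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
print("F1:      ", f1_score(y_true, y_pred))
print("AUC-ROC: ", auc_roc(y_true, scores))
```

Accuracy, recall, and F1 depend on the chosen threshold, whereas AUC-ROC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why the two kinds of metric can diverge for the same model.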
Among all the models tested, the gradient boosting classifier achieved the best overall performance, with an accuracy of 77.0%, an F1 score of 67.3%, and the highest AUC-ROC (0.601). These results indicate a strong balance between sensitivity and specificity in classifying adolescents at cardiometabolic risk. Horizontal jump and push-up performance emerged as the most influential predictive variables.
Gradient boosting proved to be the most effective model for predicting cardiometabolic risk based on physical fitness data. This approach offers a practical, data-driven tool for early risk detection in adolescent populations and may support scalable screening efforts in educational and clinical settings.