Zheng Miaobing, Zhang Yuxin, Laws Rachel A, Vuillermin Peter, Dodd Jodie, Wen Li Ming, Baur Louise A, Taylor Rachael, Byrne Rebecca, Ponsonby Anne-Louise, Hesketh Kylie D
Institute for Physical Activity and Nutrition, School of Exercise and Nutrition Sciences, Deakin University, Geelong, Australia.
School of Health Sciences, Faculty of Health & Medicine, UNSW Sydney, Wallace Wurth Building, Kensington, 2330, Australia, 61 0290659337.
JMIR Public Health Surveill. 2025 Jun 18;11:e69220. doi: 10.2196/69220.
Rapid weight gain (RWG) during infancy, defined as an upward crossing of one centile line on a weight growth chart, is highly predictive of subsequent obesity risk. Identification of infant RWG could facilitate obesity risk assessment from infancy.
Leveraging machine learning (ML) algorithms, this study aimed to develop and validate risk prediction models to identify infant RWG by the age of 1 year.
Data from 7 Australian and New Zealand cohorts were pooled for risk model development and validation (n=5233). A total of 8 ML algorithms predicted infant RWG using routinely available prenatal and early postnatal factors, including maternal prepregnancy weight status, maternal smoking during pregnancy, gestational age, parity, infant sex, birth weight, any breastfeeding, and timing of solids introduction at the age of 6 months. Pooled data were randomly split into a training dataset (70%) and a test dataset (30%) for model training and validation, respectively. Model consistency was evaluated using 5-fold cross-validation. Model predictive performance was evaluated by area under the receiver operating characteristic (ROC) curve (AUC), accuracy, precision, sensitivity, specificity, and Cohen κ.
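The 70%/30% split and 5-fold cross-validation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not report the random seed, whether the split was stratified by outcome, or which software was used, so the function names, the seed, and the unstratified shuffling here are all hypothetical.

```python
import random

def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle row indices 0..n-1 and return (train, test) index lists.

    Assumption: a simple unstratified random split; the seed is arbitrary.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]

def k_fold_indices(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    # Deal the indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, validate

# n=5233 is the pooled sample size reported in the abstract.
train, test = train_test_split_indices(5233)
cv_folds = list(k_fold_indices(train, k=5))
```

Each record in the training set is used for validation in exactly one of the five folds, and the 30% test set stays untouched until the final model is validated.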
The average prevalence of infant RWG was 27%. In the training dataset, all ML algorithms showed acceptable to excellent discrimination with AUCs ranging from 0.75 to 0.86. Accuracy, which indicates the overall correctness of the model, ranged from 0.69 to 0.78. Precision, which measures the model's ability to avoid false positives, ranged from 0.68 to 0.77. The ranges of sensitivity, specificity, and Cohen κ across all models were 0.68-0.80, 0.65-0.78, and 0.38-0.56, respectively. Of the 8 algorithms, the Gradient Boosting model showed the most favorable predictive accuracy. Validation of the Gradient Boosting model in the test dataset exhibited excellent discrimination (AUC 0.3-0.6) and good ability to make accurate predictions, particularly for true-positive cases (with accuracy and sensitivity >0.75), but modest performance for precision (0.57-0.60) and Cohen κ (0.47-0.52).
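For readers less familiar with these measures, the sketch below shows how accuracy, precision, sensitivity, specificity, and Cohen κ follow from a 2x2 confusion matrix. The counts are hypothetical and chosen only for illustration; they are not data from the study, and the function name is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total        # overall proportion of correct predictions
    precision = tp / (tp + fp)          # predicted positives that are truly positive
    sensitivity = tp / (tp + fn)        # true positives that are detected
    specificity = tn / (tn + fp)        # true negatives that are detected
    # Cohen kappa: observed agreement corrected for agreement expected by chance.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }

# Hypothetical counts (NOT study data): 100 infants with RWG, 200 without.
example = classification_metrics(tp=80, fp=55, fn=20, tn=145)
```

With these made-up counts, accuracy is 0.75 and sensitivity is 0.80, while precision is about 0.59 and κ about 0.48. When the positive class is the minority, even a moderate false-positive rate can pull precision down while sensitivity stays high.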
This study developed the first set of ML-based risk prediction models to identify infants' risk of experiencing RWG by the age of 1 year with acceptable accuracy. The models could feasibly be integrated into routine child growth monitoring and may facilitate population-wide early obesity risk assessment in primary health care.