Predictive modelling of linear growth faltering among pediatric patients with diarrhea in rural western Kenya: an explainable machine learning approach.

Author information

Ogwel Billy, Mzazi Vincent H, Awuor Alex O, Okonji Caleb, Anyango Raphael O, Oreso Caren, Ochieng John B, Munga Stephen, Nasrin Dilruba, Tickell Kirkby D, Pavlinac Patricia B, Kotloff Karen L, Omore Richard

Affiliations

Kenya Medical Research Institute - Center for Global Health Research (KEMRI-CGHR), P.O. Box 1578-40100, Kisumu, Kenya.

Department of Information Systems, University of South Africa, Pretoria, South Africa.

Publication information

BMC Med Inform Decis Mak. 2024 Dec 2;24(1):368. doi: 10.1186/s12911-024-02779-7.

Abstract

INTRODUCTION

Stunting affects one-fifth of children globally, with diarrhea accounting for an estimated 13.5% of stunting. Identifying risk factors for its precursor, linear growth faltering (LGF), is critical to designing interventions. Moreover, developing new predictive models for LGF using more recent data offers an opportunity to enhance model accuracy and interpretability and to capture new insights. We employed machine learning (ML) to derive and validate a predictive model for LGF among children enrolled with diarrhea in the Vaccine Impact on Diarrhea in Africa (VIDA) study and the Enterics for Global Health (EFGH) - Shigella study in rural western Kenya.

METHODS

We used 7 diverse ML algorithms to retrospectively build prognostic models for the prediction of LGF (≥ 0.5 decrease in height/length-for-age z-score [HAZ]) among children aged 6-35 months. For model development, we combined de-identified data from the VIDA study (n = 1,106) with synthetic data (n = 8,894) and applied split-sampling and K-fold cross-validation with an over-sampling technique; data from the EFGH-Shigella study (n = 655) were used for temporal validation. Potential predictors (n = 65), comprising demographic and household-level characteristics, illness history, and anthropometric and clinical data, were screened using Boruta feature selection, and an explanatory model analysis was used to enhance interpretability.
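As an illustration of the outcome definition and the resampling scheme described here, the sketch below labels LGF from two HAZ measurements and builds K-fold splits in which only the training portion is over-sampled. It is a minimal sketch, not the study's pipeline: the abstract does not state the value of K or which over-sampling technique was used, so `k=5`, plain random duplication of the minority class, and all function names are assumptions made for illustration.

```python
import random

LGF_THRESHOLD = 0.5  # from the abstract: LGF is a >= 0.5 decrease in HAZ


def has_lgf(haz_enrollment, haz_followup, threshold=LGF_THRESHOLD):
    """Return True when HAZ fell by at least `threshold` between the two visits."""
    return (haz_enrollment - haz_followup) >= threshold


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds, keeping the class balance similar in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def oversample_minority(indices, labels, seed=0):
    """Randomly duplicate minority-class indices until both classes are equal in size.

    Assumption: simple random over-sampling stands in for whatever
    technique the study actually used.
    """
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i]]
    neg = [i for i in indices if not labels[i]]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(indices)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(indices) + extra


def cv_splits(labels, k=5, seed=0):
    """Yield (train, test) index lists; only the training part is over-sampled."""
    folds = stratified_folds(labels, k, seed)
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield oversample_minority(train, labels, seed + f), test
```

Over-sampling inside each training fold, rather than before splitting, keeps duplicated minority cases out of the held-out fold, so the cross-validated estimate is not inflated by leakage.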

RESULTS

The prevalence of LGF in the development and temporal validation cohorts was 16.9% (n = 187) and 22.4% (n = 147), respectively. Feature selection identified the following 6 variables for use in model development, ranked by importance: age (16.6%), temperature (6.0%), respiratory rate (4.1%), severe acute malnutrition (SAM; 3.4%), rotavirus vaccination (3.3%), and skin turgor (2.1%). While all models showed good predictive capability, the gradient boosting model performed best, with an area under the curve (% [95% confidence interval]) of 83.5 [81.6-85.4] on the development dataset and 65.6 [60.8-70.4] on the temporal validation dataset.
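For readers unfamiliar with how an area under the curve and its interval can be obtained, the following sketch computes the AUC from the rank-sum (Mann-Whitney) formulation and attaches a percentile bootstrap confidence interval. The abstract does not say how the reported 95% intervals were derived, so the bootstrap, the number of resamples, and the function names are assumptions for illustration only.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # resample lacks one class; AUC is undefined, so redraw
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

Calling `auc(y_true, y_score)` on held-out predictions gives the point estimate as a fraction; multiplying by 100 gives the percentage form used in the abstract.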

CONCLUSION

Our findings accentuate the enduring relevance of established predictors of LGF whilst demonstrating the practical utility of ML algorithms for rapid identification of at-risk children.
