Applied Computer Science Department, University of Winnipeg, 515 Portage Avenue, Winnipeg, R3B 2E9, MB, Canada.
Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, 3280 Hospital Drive NW, Calgary, T2N 4Z6, AB, Canada.
BMC Med Inform Decis Mak. 2024 Nov 30;24(1):367. doi: 10.1186/s12911-024-02783-x.
Low birth weight (LBW), defined as a birth weight below 2500 g, is a growing concern in the United States (US). Previous studies have identified several contributing factors, but many analyzed these variables in isolation, which limits their ability to capture the combined influence of multiple factors. Moreover, past research has focused predominantly on maternal health, demographics, and socioeconomic conditions, often neglecting paternal factors such as age, educational level, and ethnicity. In addition, most studies have used localized datasets, which may not reflect the diversity of the US population. To address these gaps, this study applies machine learning to the 2022 Centers for Disease Control and Prevention's National Natality Dataset to identify the factors most strongly associated with LBW across the US.
We combined anthropometric, socioeconomic, maternal, and paternal factors to train logistic regression, random forest, XGBoost, conditional inference tree, and attention mechanism models to predict LBW and normal birth weight (NBW) outcomes. These models were interpreted using odds ratio analysis, feature importance, partial dependence plots (PDP), and Shapley Additive Explanations (SHAP) to identify the factors most strongly associated with LBW.
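As a small illustration of the odds ratio analysis mentioned above, the sketch below computes an odds ratio with a Woolf (log-based) 95% confidence interval from a single 2x2 table. It is a simplified stand-in, not the study's procedure: the abstract does not say how its odds ratios were derived, and in a multivariable setting they would typically be taken from fitted logistic regression coefficients. The function names and the counts are hypothetical and do not come from the natality dataset; only the 2500 g threshold is taken from the text.

```python
import math

# LBW threshold in grams, as stated in the abstract.
LBW_THRESHOLD_G = 2500


def is_lbw(birth_weight_g):
    """Return True when a birth weight falls below the LBW threshold."""
    return birth_weight_g < LBW_THRESHOLD_G


def odds_ratio(exposed_lbw, exposed_nbw, unexposed_lbw, unexposed_nbw, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    z=1.96 corresponds to an approximate 95% interval.
    """
    cells = (exposed_lbw, exposed_nbw, unexposed_lbw, unexposed_nbw)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (exposed_lbw * unexposed_nbw) / (exposed_nbw * unexposed_lbw)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, invented for illustration only:
# exposed group (e.g. smoking during pregnancy): 30 LBW, 70 NBW
# unexposed group:                               20 LBW, 180 NBW
or_est, ci_low, ci_high = odds_ratio(30, 70, 20, 180)
print(f"OR = {or_est:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 suggests an association between the exposure and LBW in that table. It does not adjust for other covariates, which is the gap that the multivariable models and SHAP-style explanations are meant to address.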
Across all five models, the factors most consistently associated with birth weight were maternal height, pre-pregnancy weight, weight gain during pregnancy, and parental ethnicity. Other pregnancy-related factors, such as the number of prenatal visits and avoiding smoking, were also significantly associated with birth weight.
The relevance of maternal anthropometric factors, pregnancy weight gain, and parental ethnicity can help explain current differences in LBW and NBW rates among ethnic groups in the US. Groups with shorter average stature, such as Asian and Hispanic populations, are more likely to have newborns below the World Health Organization's 2500 g threshold. Additionally, ethnic groups that have historically faced barriers to nutrition and perinatal care are at higher risk of delivering LBW infants.