Key Laboratory of Endocrine Glucose & Lipids Metabolism and Brain Aging, Department of Endocrinology, Ministry of Education, Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, Shandong, China.
Key Laboratory of Endocrinology of National Health Commission, Department of Endocrinology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, 100730, China.
BMC Med Inform Decis Mak. 2024 Jun 20;24(1):174. doi: 10.1186/s12911-024-02556-6.
Radiation exposure before pregnancy has previously been shown to correlate with abnormal birth weight. However, no prediction model yet exists for large-for-gestational-age (LGA) births among women exposed to radiation before pregnancy.
The data were collected from the National Free Preconception Health Examination Project in China. A total of 455 neonates (42 LGA births and 423 non-LGA births) were included. The dataset was randomly divided into a training set (n = 319) and a test set (n = 136). To develop prediction models for LGA neonates, a conventional logistic regression (LR) method and six machine learning methods were used. Recursive feature elimination was used to select the 10 features that contributed most to the prediction models, and the Shapley Additive Explanations (SHAP) method was applied to interpret the features that most strongly affected the model outputs.
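The split-and-select workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic, the 30 candidate features, the class balance, the stratified split, and all hyperparameters are assumptions, and the SHAP interpretation step is left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cohort (hypothetical): 455 rows, about 9% positives.
X, y = make_classification(
    n_samples=455, n_features=30, n_informative=8,
    weights=[0.91, 0.09], random_state=0,
)

# Random split into 319 training and 136 test rows, matching the reported sizes.
# Stratifying on the outcome is an assumption; the abstract only says "at random".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=136, stratify=y, random_state=0
)

# Recursive feature elimination: refit the forest, drop the least important
# feature, and repeat until 10 features remain.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,
    step=1,
)
selector.fit(X_train, y_train)
n_selected = int(selector.support_.sum())

# Final random forest trained on the selected features only.
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)
final_model = RandomForestClassifier(n_estimators=200, random_state=0)
final_model.fit(X_train_sel, y_train)

# Discrimination on the held-out test set.
test_auc = roc_auc_score(y_test, final_model.predict_proba(X_test_sel)[:, 1])
print(f"selected features: {n_selected}, test AUC: {test_auc:.3f}")
```

Feature selection is fitted on the training set only, so the test set stays untouched until the final evaluation.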
The random forest (RF) model had the highest average area under the receiver-operating-characteristic curve (AUC) for predicting LGA in the test set (0.843, 95% confidence interval [CI]: 0.714-0.974). Except for the logistic regression model (AUC: 0.603, 95% CI: 0.440-0.767), the other models also achieved good AUCs. The final RF prediction model, which used 10 features, achieved an average AUC of 0.821 (95% CI: 0.693-0.949).
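For readers unfamiliar with how an AUC and its confidence interval are obtained, the sketch below computes the AUC as a pairwise ranking probability and estimates a 95% CI with a percentile bootstrap. The abstract does not state how its intervals were derived, so the bootstrap is an assumption, and the labels and scores are made-up toy values (136 rows with 12 positives).

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Approximate percentile-bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy test set (hypothetical): positives tend to receive higher scores.
rng = random.Random(42)
labels = [1] * 12 + [0] * 124
scores = [rng.gauss(0.6, 0.2) for _ in range(12)] + \
         [rng.gauss(0.3, 0.2) for _ in range(124)]

point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With so few positive cases the interval is wide, which is consistent with the broad intervals reported for a test set of this size.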
The prediction model based on machine learning might be a promising tool for the prenatal prediction of LGA births in women with radiation exposure before pregnancy.