Fredriksson Alma, Fulcher Isabel R, Russell Allyson L, Li Tracey, Tsai Yi-Ting, Seif Samira S, Mpembeni Rose N, Hedt-Gauthier Bethany
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States.
Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA, United States.
Front Digit Health. 2022 Aug 17;4:855236. doi: 10.3389/fdgth.2022.855236. eCollection 2022.
Maternal and neonatal health outcomes in low- and middle-income countries (LMICs) have improved over the last two decades. However, many pregnant women still deliver at home, which increases the health risks for both the mother and the child. Community health worker programs have been broadly employed in LMICs to connect women to antenatal care and delivery locations. More recently, the use of digital tools in maternal health programs has resulted in better care delivery and has served as a routine mode of data collection. Despite the availability of rich, patient-level data within these digital tools, such data have seldom been used to inform program delivery in LMICs.
We use program data from 38,787 women enrolled in Safer Deliveries, a community health worker program in Zanzibar, Tanzania, to build a generalizable prediction model that accurately predicts whether a newly enrolled pregnant woman will deliver in a health facility. We use information collected during the enrollment visit, including demographic data, health characteristics and current pregnancy information. We apply four machine learning methods: logistic regression, LASSO regularized logistic regression, random forest and an artificial neural network. To address the imbalanced data, we apply three sampling techniques: undersampling of facility deliveries, oversampling of home deliveries and addition of synthetic home deliveries using SMOTE.
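The three sampling techniques named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two numeric toy features, the 80/20 class split and the neighbor count `k=3` are all assumptions made for illustration, and `smote_like` is a simplified stand-in for SMOTE that only handles numeric features.

```python
import random

def undersample(majority, minority, rng):
    """Randomly drop majority rows (facility) until the classes are equal."""
    return rng.sample(majority, len(minority)), list(minority)

def oversample(majority, minority, rng):
    """Randomly duplicate minority rows (home) until the classes are equal."""
    n_extra = len(majority) - len(minority)
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return list(majority), list(minority) + extra

def smote_like(majority, minority, rng, k=3):
    """Add synthetic minority rows by interpolating between a minority row
    and one of its k nearest minority neighbors (simplified SMOTE idea)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(len(majority) - len(minority)):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbors = sorted(others, key=lambda m: dist2(base, m))[:k]
        neighbor = rng.choice(neighbors)
        t = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return list(majority), list(minority) + synthetic

# Hypothetical toy data: two numeric features per woman, 80 facility vs 20 home.
data_rng = random.Random(0)
facility = [(data_rng.gauss(1, 1), data_rng.gauss(1, 1)) for _ in range(80)]
home = [(data_rng.gauss(-1, 1), data_rng.gauss(-1, 1)) for _ in range(20)]

balanced = {
    fn.__name__: fn(facility, home, random.Random(1))
    for fn in (undersample, oversample, smote_like)
}
for name, (maj, mino) in balanced.items():
    print(name, len(maj), len(mino))
```

Undersampling shrinks the training set to the size of the home-delivery class, while the other two techniques grow the home-delivery class to match the facility class. In each case only the training set would be rebalanced, so that the test set keeps the original class proportions.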
Our models correctly predicted the delivery location for 68%-77% of the women in the test set, with slightly higher accuracy when predicting facility delivery versus home delivery. A random forest model trained on a balanced training set, created by undersampling existing facility deliveries, correctly identified 74.4% of the women who delivered at home.
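The distinction between overall accuracy and the share of home deliveries correctly identified can be made concrete with a short sketch. The labels and predictions below are invented toy values, not study data, and `delivery_metrics` is a hypothetical helper name.

```python
def delivery_metrics(y_true, y_pred):
    """Return overall accuracy and the per-class share of correct predictions."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    def recall(label):
        # Among women whose true location is `label`, how many were predicted so?
        preds = [p for t, p in pairs if t == label]
        return sum(p == label for p in preds) / len(preds)

    return {
        "accuracy": accuracy,
        "home_recall": recall("home"),
        "facility_recall": recall("facility"),
    }

# Hypothetical toy labels: 6 facility deliveries and 4 home deliveries.
y_true = ["facility"] * 6 + ["home"] * 4
y_pred = ["facility"] * 5 + ["home"] + ["home"] * 3 + ["facility"]

metrics = delivery_metrics(y_true, y_pred)
print(metrics)
```

When the aim is to offer extra support to women at risk of home delivery, the home-delivery recall is the more relevant figure, because a model can reach high overall accuracy on imbalanced data while missing most home deliveries.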
This model can provide a "real-time" prediction of the delivery location for new maternal health program enrollees and may enable early provision of extra support for individuals at risk of not delivering in a health facility, which has the potential to improve health outcomes for both mothers and their newborns. The framework presented here is applicable in other contexts, and the selection of input features can easily be adapted to match data availability and other outcomes, both within and beyond maternal health.