Mathematics and Statistics Department, South Dakota State University, AME Building, Box 2225, Brookings, South Dakota 57007.
Office of Research, College of Nursing, South Dakota State University, Brookings, South Dakota.
Prev Chronic Dis. 2018 Oct 25;15:E130. doi: 10.5888/pcd15.180177.
The All Women Count! (AWC!) program is a no-cost breast and cervical cancer screening program for qualifying women in South Dakota. Our study aimed to identify counties with similar socioeconomic characteristics and to estimate the number of women who will use the program during the next 5 years.
We used AWC! data, sociodemographic predictor variables (eg, poverty level [percentage of the population with an annual income at or below 200% of the Federal Poverty Level], median income), and a mixture of Gaussian regression time series models to perform clustering and forecasting simultaneously. We selected the model by using the Bayesian information criterion (BIC) and forecast the predictor variables by using an autoregressive integrated moving average model.
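The clustering step can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each county contributes one short time series, that each cluster is a Gaussian linear regression fitted by expectation-maximization, and that BIC is computed with the number of counties as the sample size. The function names (`fit_mixture`, `bic`, `make_demo_data`) and the synthetic data are hypothetical, and the autoregressive forecasting of predictors is omitted.

```python
import numpy as np

def fit_mixture(X, y, k, n_iter=100, seed=0):
    """EM for a k-component mixture of Gaussian linear regressions.

    X: (n_counties, T, p) predictors (include a column of ones for the intercept)
    y: (n_counties, T) participant counts
    Each county is assigned to a cluster as a whole (all T years together).
    """
    rng = np.random.default_rng(seed)
    n, T, p = X.shape
    Xf, yf = X.reshape(n * T, p), y.reshape(n * T)
    resp = rng.dirichlet(np.ones(k), size=n)  # random initial responsibilities
    for _ in range(n_iter):
        # M-step: mixing weights, then weighted least squares per cluster
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        betas, sigmas = np.empty((k, p)), np.empty(k)
        for j in range(k):
            w = np.repeat(resp[:, j], T)  # county weight copied to each year
            XtW = (Xf * w[:, None]).T
            betas[j] = np.linalg.solve(XtW @ Xf + 1e-8 * np.eye(p), XtW @ yf)
            resid = yf - Xf @ betas[j]
            sigmas[j] = np.sqrt(w @ resid**2 / (w.sum() + 1e-12) + 1e-8)
        # E-step: county-level log density is the sum over its T years
        resid = y[:, :, None] - X @ betas.T  # shape (n, T, k)
        log_pt = -0.5 * np.log(2 * np.pi * sigmas**2) - 0.5 * (resid / sigmas) ** 2
        log_dens = log_pt.sum(axis=1) + np.log(weights)
        log_norm = np.logaddexp.reduce(log_dens, axis=1)
        resp = np.exp(log_dens - log_norm[:, None])
    return {"weights": weights, "betas": betas, "sigmas": sigmas,
            "labels": resp.argmax(axis=1), "loglik": float(log_norm.sum())}

def bic(fit, n_units):
    """BIC = -2 log L + (free parameters) * log(n); lower is better.
    Assumption: n is the number of counties, not county-years."""
    k, p = fit["betas"].shape
    n_params = (k - 1) + k * p + k  # mixing weights, coefficients, variances
    return -2.0 * fit["loglik"] + n_params * np.log(n_units)

def make_demo_data(n_per_group=20, T=10, seed=1):
    """Synthetic counties: one group trends upward, the other downward."""
    rng = np.random.default_rng(seed)
    t = np.arange(T, dtype=float)
    slopes = np.repeat([2.0, -2.0], n_per_group)
    y = 50.0 + slopes[:, None] * t + rng.normal(0.0, 0.5, (2 * n_per_group, T))
    X = np.stack([np.ones_like(y), np.broadcast_to(t, y.shape)], axis=2)
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    for k in (1, 2, 3):
        print(k, bic(fit_mixture(X, y, k), n_units=X.shape[0]))
```

In this sketch the number of clusters would be chosen as the `k` with the lowest BIC, and a cluster-level forecast would come from applying that cluster's coefficients to forecast predictor values.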
By using BIC, we identified 5 clusters of South Dakota counties with similar predictor variables and numbers of participants. The mixture model identified groups of counties with increasing or decreasing trends in participation and produced average forecasts for each cluster.
The mixture of regression time series models used in this study allowed for the identification of similar counties and provided a forecasting model for future years. Although several predictors contributed to program participation, we believe our forecasting analysis by county may provide useful information to improve the implementation of the AWC! program by informing program managers of the expected number of participants in the next 5 years. This, in turn, will support data-driven resource allocation.