Modelling and Economics Unit, UK Health Security Agency, London, United Kingdom.
Institute of Sound and Vibration Research, University of Southampton, Southampton, United Kingdom.
PLoS Comput Biol. 2022 Mar 3;18(3):e1008858. doi: 10.1371/journal.pcbi.1008858. eCollection 2022 Mar.
The basic reproduction number (R0) of an infection determines the impact of its control. For many endemic infections, R0 is often estimated from appropriate country-specific seroprevalence data. For settings lacking seroprevalence data, studies sometimes pool estimates from the same region, but the reliability of this approach is unclear. Plausibly, indicator-based approaches could predict R0 for such settings. We calculated R0 for rubella for 98 settings and correlated its value against 66 demographic, economic, education, housing and health-related indicators. We also trained a random forest regression algorithm using these indicators as the input and R0 as the output. We used the mean-square error to compare the performance of the random forest, simple linear regression and a regional averaging method in predicting R0 using 4-fold cross-validation. R0 was <5, 5-10 and >10 for 81, 14 and 3 settings respectively, with no apparent regional differences. In the limited available data, it was usually lower for rural than for urban areas. R0 was most correlated with educational attainment and household indicators when using the Pearson and Spearman correlation coefficients respectively, and with poverty-related indicators followed by the crude death rate when using the Maximum Information Coefficient, although each correlation was relatively weak (Pearson correlation coefficient: 0.4, 95% CI: 0.24-0.48, for educational attainment). Depending on the subsets of training indicators and studies, a random forest did not perform better than simple linear regression in predicting R0, and neither outperformed a regional averaging approach. R0 for rubella is typically low, and using indicators to estimate its value is not straightforward. A regional averaging approach may provide as reliable an estimate of R0 for settings lacking seroprevalence data as one based on indicators.
The findings may be relevant for other infections and studies estimating the disease burden and the impact of interventions for settings lacking seroprevalence data.
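The comparison described in the abstract (k-fold cross-validation scored by mean-square error) can be illustrated with a minimal sketch. This is not the authors' code. It compares only two of the three methods, regional averaging and simple linear regression on a single indicator. The random forest is left out because it would need a third-party library, although it could be added as a third model in the same loop. The data are synthetic, the region labels and field names (`region`, `indicator`, `r0`) are hypothetical, and pooling squared errors across folds is an assumption about how the mean-square error is aggregated.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_simple_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns a predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_regional_average(regions, ys):
    """Predict the mean training R0 of a setting's region.

    Falls back to the overall training mean for regions not seen in training.
    """
    overall = mean(ys)
    grouped = {}
    for region, y in zip(regions, ys):
        grouped.setdefault(region, []).append(y)
    region_means = {region: mean(vals) for region, vals in grouped.items()}
    return lambda region: region_means.get(region, overall)


def cross_validated_mse(settings, k=4, seed=0):
    """Return the pooled held-out mean-square error for each method."""
    sq_err = {"linear": [], "regional": []}
    for test_idx in k_fold_indices(len(settings), k, seed):
        held_out = set(test_idx)
        train = [s for i, s in enumerate(settings) if i not in held_out]
        test = [settings[i] for i in test_idx]
        train_r0 = [s["r0"] for s in train]
        linear = fit_simple_linear([s["indicator"] for s in train], train_r0)
        regional = fit_regional_average([s["region"] for s in train], train_r0)
        for s in test:
            sq_err["linear"].append((linear(s["indicator"]) - s["r0"]) ** 2)
            sq_err["regional"].append((regional(s["region"]) - s["r0"]) ** 2)
    return {name: mean(errs) for name, errs in sq_err.items()}


# Synthetic data only: 98 hypothetical settings, not the study's estimates.
rng = random.Random(1)
settings = []
for _ in range(98):
    indicator = rng.random()
    settings.append({
        "region": rng.choice(["A", "B", "C", "D"]),  # hypothetical regions
        "indicator": indicator,
        "r0": 3.0 + 2.0 * indicator + rng.gauss(0.0, 1.0),
    })

results = cross_validated_mse(settings, k=4)
print(results)
```

A lower value in `results` means better out-of-sample prediction on this synthetic data. It says nothing about the study's actual findings.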