RAND, Pittsburgh, PA 15213, USA.
Stat Med. 2011 Feb 28;30(5):584-94. doi: 10.1002/sim.3897. Epub 2011 Feb 3.
Repeated cross-sectional samples are common in national surveys of health like the National Health Interview Survey (NHIS). Because population health outcomes generally evolve slowly, pooling data across years can improve the precision of current-year annual estimates of disease prevalence and other health outcomes. Pooling over time is particularly valuable in health disparities research, where outcomes for small groups are often of interest and pooling data across groups would bias disparity estimates. State-space modeling and Kalman filtering are appealing choices for smoothing data across time. However, filtering can be problematic when few time points are available, as is common with annual cross-sectional data. Problems arise because filtering relies on estimated variance components, which can be biased and imprecise when estimated with small samples, especially when estimated in tandem with linear trends. We conduct a simulation study showing that even when trends and variance components are estimated poorly, smoothing with these estimates can improve the mean squared error (MSE) of estimated health states for multiple racial/ethnic groups when the variance components are estimated with the pooled sample. We consider frequentist estimators with no trends, one common trend across groups, and separate trends for every group, as well as shrinkage estimators of trends through a Bayesian model. We show that the Bayesian model offers the greatest improvement in MSE, and that Bayesian Information Criterion (BIC)-based model averaging of the frequentist estimators with different trend assumptions performs nearly as well. We present empirical examples using the NHIS data.
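As a rough illustration of the two ideas named above, the following sketch filters a short series of annual estimates with a local-level (random-walk) state-space model and computes BIC-based model weights. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `local_level_filter` and `bic_weights` are hypothetical, the model has no trend term, the variance components are treated as known rather than estimated from the pooled sample, and the prevalence figures and BIC values are made up.

```python
import math


def local_level_filter(y, obs_var, state_var, m0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + e_t, mu_t = mu_{t-1} + w_t.

    y         : annual survey estimates (illustrative values only)
    obs_var   : sampling variance of each annual estimate (assumed known)
    state_var : variance of the year-to-year change in the true state (assumed known)
    m0, p0    : diffuse prior mean and variance for the initial state
    """
    m, p = m0, p0
    means, variances = [], []
    for y_t, v_t in zip(y, obs_var):
        # Predict: a random-walk state keeps its mean; its uncertainty grows.
        p = p + state_var
        # Update: weight the new survey estimate by the Kalman gain.
        k = p / (p + v_t)
        m = m + k * (y_t - m)
        p = (1.0 - k) * p
        means.append(m)
        variances.append(p)
    return means, variances


def bic_weights(bics):
    """Approximate model weights proportional to exp(-BIC / 2)."""
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical annual prevalence estimates for one small group.
prevalence = [0.12, 0.15, 0.11, 0.14, 0.13]
sampling_var = [0.0004] * len(prevalence)
means, variances = local_level_filter(prevalence, sampling_var, state_var=0.0001)

# Hypothetical BIC values for no-trend, common-trend, and separate-trend models.
weights = bic_weights([100.0, 102.0, 110.0])
```

In this sketch the last element of `means` plays the role of the smoothed current-year estimate, and its filtered variance is always below the raw sampling variance for that year, which is the precision gain that pooling across years is meant to deliver. A model-averaged estimate would combine the per-model estimates using `weights`.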