Sutherland Jason M, Castelluccio Pete, Schwarz Carl James
Center for Health Policy Research, The Dartmouth Institute for Health Policy and Clinical Practice, Dartmouth College, Hanover, New Hampshire 03766, USA.
Biometrics. 2009 Sep;65(3):841-9. doi: 10.1111/j.1541-0420.2008.01129.x. Epub 2009 Jan 23.
Statistical methods have been developed and applied to estimate the size of populations that are difficult or too costly to enumerate. In epidemiological settings these are known as multilist methods: individuals are matched across lists, and population size is estimated by modeling counts in incomplete multidimensional contingency tables (based on patterns of presence/absence on lists). Because multilist methods typically assume that lists are compiled instantaneously, few options are available for estimating the unknown size of a closed population from continuously (longitudinally) compiled lists. However, in epidemiological settings, continuous-time lists are a routine byproduct of administrative functions. Existing methods are based on time-to-event analyses with a second step of estimating population size. We propose an alternative approach to address the twofold epidemiological problem of estimating population size and of identifying patient factors related to the duration (in days) between visits to a health care facility. A Bayesian framework is proposed to model interval lengths because, for many patients, the data are sparse; many patients were observed only once or twice. The proposed method is applied to the motivating data to illustrate its applicability. Then, a small simulation study explores the performance of the estimator under a variety of conditions. Finally, a brief discussion suggests opportunities for continued methodological development in continuous-time population estimation.
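For orientation only, the instantaneous-list setting mentioned in the abstract can be sketched in its simplest form: two lists, matched on a patient identifier, with population size estimated from the overlap. The sketch below uses Chapman's bias-corrected two-list estimator, which assumes a closed population, independent lists, and error-free matching. It is not the continuous-time Bayesian method proposed in the paper, and the function names and patient identifiers are hypothetical.

```python
def two_list_counts(list_a, list_b):
    """Return (n1, n2, m): size of each list and the number of matched individuals."""
    a, b = set(list_a), set(list_b)
    return len(a), len(b), len(a & b)


def chapman_estimate(n1, n2, m):
    """Chapman's bias-corrected two-list estimate of a closed population's size.

    Assumes independent lists and perfect matching (illustrative assumptions,
    not taken from the paper).
    """
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("counts must be non-negative and m <= min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical patient identifiers appearing on two administrative lists.
list_a = [1, 2, 3, 4, 5, 6]
list_b = [4, 5, 6, 7, 8, 9]

n1, n2, m = two_list_counts(list_a, list_b)
estimate = chapman_estimate(n1, n2, m)
print(n1, n2, m, estimate)
```

With more than two lists, the same idea generalizes to modeling the counts in the incomplete contingency table of presence/absence patterns, where the cell for individuals on no list is unobserved.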