Leu Monica, Czene Kamila, Reilly Marie
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
Epidemiology. 2007 Jul;18(4):433-40. doi: 10.1097/EDE.0b013e31805d8ab2.
Studies of familial aggregation of disease routinely use linked population registers to construct retrospective cohorts. Although such resources have provided numerous estimates of familial risk, little is known regarding the sensitivity of the estimates to assumed disease models, changing demographics and incidence, and incompleteness of the data. Furthermore, there are no standard tools for testing the validity of estimates from standard epidemiologic designs and from new analytic strategies using register data.
We present a method and a software package for simulating realistic populations of related individuals, using easily available vital statistics (population counts and fertility and mortality rates). The virtual population is stored in a pedigree file, allowing for easy retrieval of relatives and family structures. We simulate breast cancer in our population using age-specific incidence rates.
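The approach described above can be illustrated with a minimal sketch. This is not Population Lab's actual implementation (the package itself is distributed for R); it only assumes a yearly time step, and the rate functions `fertility_rate` and `mortality_rate`, the `Person` record, and the helper `siblings` are all hypothetical names with invented values chosen for illustration. Fathers are drawn at random for each birth, which is a simplification.

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass
class Person:
    pid: int
    sex: str                        # "F" or "M"
    birth_year: int
    mother: Optional[int] = None    # pedigree links; None for founders
    father: Optional[int] = None
    death_year: Optional[int] = None

def fertility_rate(age):
    # Hypothetical annual birth probability for women (not real vital statistics)
    return 0.10 if 20 <= age < 40 else 0.0

def mortality_rate(age):
    # Hypothetical annual death probability (not real vital statistics)
    return 0.005 if age < 60 else 0.05

def simulate(n_founders, start_year, end_year, seed=0):
    """Evolve a founder population year by year; return a pid -> Person pedigree."""
    rng = random.Random(seed)
    people = {}
    for pid in range(n_founders):
        sex = "F" if pid % 2 == 0 else "M"
        age = rng.randrange(0, 80)
        people[pid] = Person(pid, sex, start_year - age)
    next_id = n_founders
    for year in range(start_year, end_year + 1):
        # Snapshot of those alive at the start of the year
        alive = [p for p in people.values() if p.death_year is None]
        men = [p for p in alive if p.sex == "M" and 20 <= year - p.birth_year < 60]
        for p in alive:
            age = year - p.birth_year
            if p.sex == "F" and men and rng.random() < fertility_rate(age):
                father = rng.choice(men)
                people[next_id] = Person(next_id, rng.choice("FM"), year,
                                         mother=p.pid, father=father.pid)
                next_id += 1
            if rng.random() < mortality_rate(age):
                p.death_year = year
    return people

def siblings(people, pid):
    """Maternal siblings of pid, retrieved through the shared mother link."""
    mother = people[pid].mother
    if mother is None:
        return []
    return [q.pid for q in people.values() if q.mother == mother and q.pid != pid]
```

Storing parent identifiers on each record is what makes relatives easy to retrieve: sibships, parent-child pairs, and larger family structures can all be derived from the two links.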
The Swedish population is simulated as dynamically evolving over the calendar period 1955-2002. The simulated and real populations agree well on important features such as age profile, sibship size distribution, and average age at first birth. Using breast cancer as an example, we present several models of familial disease aggregation and show that the parameters used in the simulations are faithfully estimated. In addition, we illustrate how our simulated population provides insight into how incomplete family history in real register data can affect estimates of familial risk.
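One simple model of familial aggregation, shown here only as a hedged sketch and not necessarily one of the models used in the paper, multiplies the age-specific incidence by a familial relative risk for individuals with an affected relative. The rates in `baseline_incidence` are invented, and `simulate_onset` and `crude_rate_ratio` are hypothetical helper names. The last function checks that the relative risk used to generate the data is roughly recovered from the simulated cases and person-years.

```python
import random

def baseline_incidence(age):
    # Hypothetical annual incidence by age (not real breast cancer rates)
    if age < 40:
        return 0.0005
    if age < 60:
        return 0.002
    return 0.003

def simulate_onset(rng, exposed, rr, max_age=80):
    """Return (age at onset or None, person-years at risk) for one individual."""
    for age in range(max_age):
        # Exposed individuals (affected relative) have their rate multiplied by rr
        rate = baseline_incidence(age) * (rr if exposed else 1.0)
        if rng.random() < rate:
            return age, age + 1
    return None, max_age

def crude_rate_ratio(n, rr, seed=0):
    """Simulate n individuals, half exposed, and estimate the incidence rate ratio."""
    rng = random.Random(seed)
    totals = {True: [0, 0], False: [0, 0]}   # exposure -> [cases, person-years]
    for i in range(n):
        exposed = i % 2 == 0
        onset, person_years = simulate_onset(rng, exposed, rr)
        totals[exposed][0] += onset is not None
        totals[exposed][1] += person_years
    (cases_e, py_e), (cases_u, py_u) = totals[True], totals[False]
    return (cases_e / py_e) / (cases_u / py_u)
```

The estimate here is a crude, not age-adjusted, rate ratio, so it will only approximate the simulated relative risk; an age-stratified or regression-based estimator would be the more careful choice.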
This simulation method can be used to investigate various underlying models of disease aggregation in families and enhance the development of optimal approaches for family studies. The software package, Population Lab, is available for free download (http://www.meb.ki.se/~marrei/software/poplab/ and http://cran.at.r-project.org/).