Waltoft Berit Lindum, Pedersen Carsten Bøcker, Nyegaard Mette, Hobolth Asger
National Center for Register-based Research, Department of Economics and Business Economics, Aarhus University, Fuglesangs allé 4 room K10, 8210, Aarhus V, Denmark.
The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.
BMC Med Genet. 2015 Aug 30;16:71. doi: 10.1186/s12881-015-0210-1.
In recent years, genome-wide association studies have identified many genetic variants that are consistently associated with common complex diseases, but the proportion of heritability explained by these risk alleles remains low. Part of the missing heritability may be due to genetic heterogeneity and small sample sizes, but non-optimal study designs in many genome-wide association studies may also have contributed to the failure to identify gene variants that predispose to disease. The commonly used odds ratio from a classical case-control study measures the association between genotype and being diseased. In comparison, under incidence density sampling, the incidence rate ratio measures the association between genotype and becoming diseased. We estimate the differences between the odds ratio and the incidence rate ratio in the presence of events that preclude the disease of interest. Such events may arise due to pleiotropy and are known as competing events. In addition, we investigate how these differences affect the association test.
We simulated the life spans of individuals whose gene variants are subject to competing events. To estimate the association between genotype and disease, we applied both a classical case-control design and incidence density sampling.
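The contrast between the two designs can be illustrated with a minimal simulation sketch. It is not the authors' simulation. Every numeric value, the binary carrier status, the constant (exponential) hazards, and the helper names `simulate_cohort`, `classical_odds_ratio` and `density_sampling_rate_ratio` are assumptions chosen for illustration. The sketch also assumes that controls in the classical design are the individuals who are alive and disease-free at the end of follow-up, and that one control per case is drawn under incidence density sampling.

```python
import random


def simulate_cohort(n, carrier_freq, disease_rate, competing_rate,
                    irr_disease, irr_competing, end_age, rng):
    """Simulate (carrier, exit_age, status) for n individuals.

    Assumption: constant hazards, so times to disease and to the
    competing event are exponential; carriers have both hazards
    multiplied by the given rate ratios.
    """
    cohort = []
    for _ in range(n):
        carrier = rng.random() < carrier_freq
        t_disease = rng.expovariate(
            disease_rate * (irr_disease if carrier else 1.0))
        t_competing = rng.expovariate(
            competing_rate * (irr_competing if carrier else 1.0))
        exit_age = min(t_disease, t_competing, end_age)
        if t_disease == exit_age:
            status = "case"
        elif t_competing == exit_age:
            status = "competing"
        else:
            status = "disease_free"  # alive and disease-free at end_age
        cohort.append((carrier, exit_age, status))
    return cohort


def classical_odds_ratio(cohort):
    """Odds ratio comparing cases with those disease-free at end of follow-up."""
    case_carrier = case_non = ctrl_carrier = ctrl_non = 0
    for carrier, _, status in cohort:
        if status == "case":
            case_carrier += carrier
            case_non += not carrier
        elif status == "disease_free":
            ctrl_carrier += carrier
            ctrl_non += not carrier
    return (case_carrier * ctrl_non) / (case_non * ctrl_carrier)


def density_sampling_rate_ratio(cohort, rng, max_tries=1000):
    """Incidence density sampling with one control per case.

    For each case, a control is drawn from the individuals still at risk
    at the case's age of onset (rejection sampling). The ratio of
    discordant pairs is the conditional estimate for 1:1 matched sets.
    """
    case_only = control_only = 0
    for carrier, onset_age, status in cohort:
        if status != "case":
            continue
        for _ in range(max_tries):
            ctrl_carrier, ctrl_exit, _ = rng.choice(cohort)
            if ctrl_exit > onset_age:  # still at risk at the case's onset
                break
        else:
            continue  # no eligible control found; skip this case
        if carrier and not ctrl_carrier:
            case_only += 1
        elif ctrl_carrier and not carrier:
            control_only += 1
    return case_only / control_only


rng = random.Random(2015)
# Illustrative values only: the variant raises both the disease hazard
# and the competing-event hazard by a factor of 1.5.
cohort = simulate_cohort(n=40000, carrier_freq=0.3,
                         disease_rate=0.002, competing_rate=0.01,
                         irr_disease=1.5, irr_competing=1.5,
                         end_age=80.0, rng=rng)
odds_ratio = classical_odds_ratio(cohort)
rate_ratio = density_sampling_rate_ratio(cohort, rng)
print(f"classical odds ratio:           {odds_ratio:.2f}")
print(f"density-sampled rate ratio:     {rate_ratio:.2f}")
```

Under these assumed parameters the density-sampled estimate should fall near the simulated rate ratio of 1.5, whereas the classical odds ratio is pushed upwards because carriers are under-represented among the survivors who serve as controls.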
We find significant numerical differences between the odds ratio and the incidence rate ratio when the possibility that a gene variant is associated with competing events, e.g. lifetime, is ignored. The only scenario showing little or no difference is an association with a rare disease in the absence of any other associations. Furthermore, the p-values of the association tests differ between the two study designs.
If the interest is in the aetiology of the disease, a design based on incidence density sampling provides the preferred interpretation of the estimate. Under a classical case-control design and in the presence of competing events, the change in the p-values of the association test may lead to false positive findings and, more importantly, false negative findings. The ranking of the SNPs by p-value may also differ between the two study designs.