Anderson Christopher D, Nalls Michael A, Biffi Alessandro, Rost Natalia S, Greenberg Steven M, Singleton Andrew B, Meschia James F, Rosand Jonathan
Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA.
Circ Cardiovasc Genet. 2011 Apr;4(2):188-96. doi: 10.1161/CIRCGENETICS.110.957928. Epub 2011 Feb 3.
Survival bias is the phenomenon by which individuals are excluded from analysis of a trait because of mortality related to the expression of that trait. In genetic association studies, variants that increase both the risk of disease onset and the risk of disease-related mortality (lethality) could be difficult to detect in case-control designs, possibly leading to underestimation of a variant's effect on disease risk.
We modeled cohorts for 3 diseases of high lethality (intracerebral hemorrhage, ischemic stroke, and myocardial infarction) using existing longitudinal data. Based on these models, we simulated case-control genetic association studies for genetic risk factors of varying effect sizes, lethality, and minor allele frequencies. For each disease, erosion of detected effect size was larger for case-control studies of individuals of advanced age (age >75 years) and/or variants with very high event-associated lethality (genotype relative risk for event-related death >2.0). We found that survival bias results in no more than 20% effect size erosion for cohorts with mean age <75 years, even for variants that double lethality risk. Furthermore, we found that increasing effect size erosion was accompanied by depletion of minor allele frequencies in the case population, yielding a "signature" of the presence of survival bias.
Our simulation provides formulas to allow estimation of effect size erosion given a variant's odds ratio of disease, odds ratio of lethality, and minor allele frequencies. These formulas will add precision to power calculation and replication efforts for case-control genetic studies. Our approach requires validation using prospective data.
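The mechanism described above can be illustrated with a minimal sketch. It is not the authors' simulation or their published formulas. It assumes a single time point with no age structure or competing mortality, a carrier versus non-carrier (dominant) model in place of minor allele frequencies, and controls drawn from all disease-free individuals. The function name `survival_bias`, its parameters, and the input values in the usage example are hypothetical and chosen only for illustration. Under these assumptions, the odds ratio observed among surviving cases equals the true odds ratio multiplied by the ratio of carrier to non-carrier survival, and carriers are depleted among surviving cases.

```python
from dataclasses import dataclass


@dataclass
class BiasResult:
    true_or: float                       # odds ratio using all incident cases
    observed_or: float                   # odds ratio using surviving cases only
    carrier_freq_all_cases: float        # carrier fraction among all cases
    carrier_freq_surviving_cases: float  # carrier fraction among survivors


def survival_bias(carrier_freq, baseline_risk, disease_or,
                  baseline_fatality, lethality_rr):
    """Expected case-control odds ratios with and without survival bias.

    Hypothetical one-time-point model (an assumption, not the paper's):
    carriers have disease odds multiplied by `disease_or` and case
    fatality multiplied by `lethality_rr` (capped at 1.0).
    """
    for name, value in (("carrier_freq", carrier_freq),
                        ("baseline_risk", baseline_risk),
                        ("baseline_fatality", baseline_fatality)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    if disease_or <= 0 or lethality_rr <= 0:
        raise ValueError("disease_or and lethality_rr must be positive")

    # Convert the disease odds ratio into a carrier disease risk.
    base_odds = baseline_risk / (1.0 - baseline_risk)
    carrier_odds = base_odds * disease_or
    carrier_risk = carrier_odds / (1.0 + carrier_odds)
    carrier_fatality = min(1.0, baseline_fatality * lethality_rr)

    # Expected population fractions in each genotype-by-status cell.
    case_c = carrier_freq * carrier_risk
    case_n = (1.0 - carrier_freq) * baseline_risk
    ctrl_c = carrier_freq * (1.0 - carrier_risk)
    ctrl_n = (1.0 - carrier_freq) * (1.0 - baseline_risk)

    # Only cases who survive the event can be enrolled.
    surv_c = case_c * (1.0 - carrier_fatality)
    surv_n = case_n * (1.0 - baseline_fatality)

    return BiasResult(
        true_or=(case_c * ctrl_n) / (case_n * ctrl_c),
        observed_or=(surv_c * ctrl_n) / (surv_n * ctrl_c),
        carrier_freq_all_cases=case_c / (case_c + case_n),
        carrier_freq_surviving_cases=surv_c / (surv_c + surv_n),
    )


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the study.
    result = survival_bias(carrier_freq=0.2, baseline_risk=0.05,
                           disease_or=1.5, baseline_fatality=0.3,
                           lethality_rr=2.0)
    print(result)
```

In this simplified model the attenuation depends only on the two case-fatality values, so it cannot reproduce the age dependence reported above; capturing that would require a longitudinal cohort model of the kind the authors describe.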