Akula Nirmala, Detera-Wadleigh Sevilla, Shugart Yin Yao, Nalls Michael, Steele Jo, McMahon Francis J
Mood and Anxiety Section, Human Genetics Branch, National Institute of Mental Health, National Institutes of Health, 35 Convent Drive, Bethesda, MD 20892, USA.
BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S76. doi: 10.1186/1753-6561-5-S9-S76.
Large-scale, deep resequencing may be the next logical step in the genetic investigation of common complex diseases. Because each individual is likely to carry many thousands of variants, identifying causal alleles requires an efficient strategy for reducing the number of candidate variants. Under many genetic models, causal alleles can be expected to reside within identity-by-descent (IBD) regions shared by affected relatives. In distant relatives, IBD regions constitute a small portion of the genome and can thus greatly reduce the search space for causal alleles. However, the effectiveness of this strategy is unknown. We tested it on the simulated mini-exome data set in extended pedigrees provided by Genetic Analysis Workshop 17. At the fourth and fifth degrees of relatedness, case-case pairs shared between 1% and 9% of the genome identical by descent. As expected, no genes were shared identical by descent by all case subjects, but 43 genes were shared by many case subjects across at least 50 replicates. We filtered variants in these genes by population frequency, function, informativeness, and evidence of association, the last assessed with the family-based association test. This analysis highlighted five genes previously implicated in triglyceride, lipid, and cholesterol metabolism. Comparison with the list of true risk alleles revealed that strict IBD filtering followed by association testing of the rarest alleles was the most sensitive strategy. IBD filtering may be a useful strategy for narrowing down the list of candidate variants in exome data, but the optimal degree of relatedness of affected pairs will depend on the genetic architecture of the disease under study.
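The filtering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Segment` and `Variant` types, the `ibd_filter` helper, the 1% frequency cutoff, the coordinates, and the placeholder gene names are all assumptions made for illustration. The one general fact used is that relatives of degree d are expected to share about (1/2)^d of the genome identical by descent, which gives roughly 6% and 3% for fourth- and fifth-degree pairs, in line with the 1% to 9% range reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A region shared IBD by an affected pair (hypothetical representation)."""
    chrom: str
    start: int
    end: int  # inclusive


@dataclass(frozen=True)
class Variant:
    """A candidate variant with the annotations used for filtering (hypothetical)."""
    chrom: str
    pos: int
    gene: str
    maf: float           # population minor allele frequency
    nonsynonymous: bool  # crude stand-in for a "function" filter


def expected_ibd_fraction(degree: int) -> float:
    """Expected proportion of the genome shared IBD by relatives of a given degree."""
    return 0.5 ** degree


def in_any_segment(variant: Variant, segments: list[Segment]) -> bool:
    """True if the variant lies inside at least one shared IBD segment."""
    return any(
        s.chrom == variant.chrom and s.start <= variant.pos <= s.end
        for s in segments
    )


def ibd_filter(variants: list[Variant], segments: list[Segment],
               max_maf: float = 0.01) -> list[Variant]:
    """Keep rare, nonsynonymous variants inside shared IBD segments.

    The max_maf cutoff is an assumed value, not one taken from the source.
    """
    return [
        v for v in variants
        if in_any_segment(v, segments) and v.maf <= max_maf and v.nonsynonymous
    ]


# Toy data: coordinates and gene names are placeholders, not study results.
shared = [Segment("1", 100, 200), Segment("2", 100, 300)]
variants = [
    Variant("1", 150, "GENE_A", 0.004, True),   # rare, functional, inside IBD
    Variant("1", 180, "GENE_A", 0.20, True),    # too common
    Variant("1", 500, "GENE_B", 0.002, True),   # outside shared segments
    Variant("2", 150, "GENE_C", 0.003, False),  # synonymous
]
kept = ibd_filter(variants, shared)
print([v.gene for v in kept])
print(expected_ibd_fraction(4), expected_ibd_fraction(5))
```

In a real analysis the surviving variants would then go to association testing, as the abstract describes. The sketch covers only the region, frequency, and function filters.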