Shi Gang, Simino Jeannette, Rao Dabeeru C
Division of Biostatistics, Washington University School of Medicine, 660 South Euclid Avenue, St. Louis, MO 63110, USA.
BMC Proc. 2011 Nov 29;5 Suppl 9(Suppl 9):S82. doi: 10.1186/1753-6561-5-S9-S82.
Genome-wide association studies have been successful in identifying common variants for common complex traits in recent years. However, common variants have generally failed to explain substantial proportions of the trait heritabilities. Rare variants, structural variations, and gene-gene and gene-environment interactions, among others, have been suggested as potential sources of the so-called missing heritability. With the advent of exome-wide and whole-genome next-generation sequencing technologies, finding rare variants in functionally important sites (e.g., protein-coding regions) has become feasible. We investigate whether linkage information can be used to select families enriched for rare variants, using the simulated Genetic Analysis Workshop 17 data. In each replicate of the simulated phenotypes Q1 and Q2 on 697 subjects in 8 extended pedigrees, we select the pedigree with the largest family-specific LOD score. Across all 200 replicates, we compare the probability that rare causal alleles are carried in the selected pedigree with the corresponding probability for a randomly chosen pedigree. One example of successful enrichment was observed for the gene VEGFC. The causal variant had a minor allele frequency of 0.0717% in the simulated unrelated individuals and explained about 0.1% of the phenotypic variance. However, it explained 7.9% of the phenotypic variance in the eight simulated pedigrees and 23.8% in the family that carried the minor allele. The carrier family was selected in all 200 replicates. Our results therefore show that family-specific linkage information is useful for selecting families for sequencing, thereby ensuring that rare functional variants are segregating in the sequenced samples.
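The selection rule described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `enrichment`, the input layout (one row of family-specific LOD scores per replicate, one column per pedigree), and the toy numbers are all hypothetical and are not taken from the Genetic Analysis Workshop 17 data. Ties in the LOD score are broken by taking the first pedigree.

```python
from typing import Sequence, Tuple


def enrichment(
    lod_scores: Sequence[Sequence[float]],
    carrier: Sequence[bool],
) -> Tuple[float, float]:
    """Compare LOD-based pedigree selection with random selection.

    lod_scores[r][k] is the family-specific LOD score of pedigree k in
    replicate r; carrier[k] is True if pedigree k carries the rare causal
    allele. Returns (probability that the top-LOD pedigree is a carrier,
    probability that a uniformly chosen pedigree is a carrier).
    """
    n_ped = len(carrier)
    hits = 0
    for rep in lod_scores:
        if len(rep) != n_ped:
            raise ValueError("each replicate needs one LOD score per pedigree")
        # Select the pedigree with the largest family-specific LOD score.
        best = max(range(n_ped), key=lambda k: rep[k])
        hits += carrier[best]
    p_selected = hits / len(lod_scores)
    p_random = sum(carrier) / n_ped
    return p_selected, p_random


# Toy input (hypothetical values): 3 replicates, 4 pedigrees, and only
# pedigree 1 carries the rare allele.
toy_lod = [
    [0.2, 1.5, 0.1, 0.4],
    [0.3, 2.1, 0.9, 0.0],
    [1.2, 0.8, 0.1, 0.5],
]
toy_carrier = [False, True, False, False]
p_selected, p_random = enrichment(toy_lod, toy_carrier)
print(p_selected, p_random)
```

In this toy input the carrier pedigree has the largest LOD score in two of the three replicates, so LOD-based selection picks a carrier with probability 2/3, against 1/4 for a randomly chosen pedigree.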