Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA.
Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
Am J Hum Genet. 2014 Sep 4;95(3):245-56. doi: 10.1016/j.ajhg.2014.08.004.
Recent and rapid human population growth has led to an excess of rare genetic variants that are expected to contribute to an individual's genetic burden of disease risk. To date, much of the focus has been on rare protein-coding variants, for which potential impact can be estimated from the genetic code, but determining the impact of rare noncoding variants has been more challenging. To improve our understanding of such variants, we combined high-quality genome sequencing and RNA sequencing data from a 17-individual, three-generation family to contrast expression quantitative trait loci (eQTLs) and splicing quantitative trait loci (sQTLs) within this family with eQTLs and sQTLs within a population sample. Using this design, we found that eQTLs and sQTLs with large effects in the family were enriched for rare regulatory and splicing variants (minor allele frequency < 0.01). They were also more likely to influence essential genes and genes involved in complex disease. In addition, we tested the capacity of diverse noncoding annotations to predict the impact of rare noncoding variants. We found that distance to the transcription start site, evolutionary constraint, and epigenetic annotation were considerably more informative for predicting the impact of rare variants than for predicting the impact of common variants. These results highlight that rare noncoding variants are important contributors to individual gene-expression profiles and further demonstrate a significant capability for genomic annotation to predict the impact of rare noncoding variants.