Ge Xiaoyan, Kwok Pui-Yan, Shieh Joseph T C
Division of Medical Genetics, Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA; Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA.
Institute for Human Genetics, University of California San Francisco, San Francisco, CA 94143, USA; Department of Dermatology, University of California San Francisco, San Francisco, CA 94143, USA; and Cardiovascular Research Institute, University of California San Francisco, San Francisco, CA 94143, USA.
Hum Mol Genet. 2015 Feb 1;24(3):599-608. doi: 10.1093/hmg/ddu473. Epub 2014 Sep 12.
Many new disease genes can be identified through high-throughput sequencing. Yet variant interpretation for large amounts of genomic data remains a challenge, given variants of uncertain significance and genes that lack disease annotation. Because clinically significant disease genes may be subject to negative selection, we developed a prediction method that measures the paucity of non-synonymous variation in the human population to infer gene-based pathogenicity. Integrating human exome data from over 6000 individuals in the NHLBI Exome Sequencing Project, we tested the utility of this prediction method, which is based on the ratio of non-synonymous to synonymous substitution rates (dN/dS), on X-chromosome genes. A low dN/dS ratio characterized genes associated with childhood disease and outcome. Furthermore, we identified new candidate genes for diseases with early mortality and demonstrated localized intragenic patterns of variants that suggest pathogenic hotspots. Our results suggest that intrahuman substitution analysis is a valuable tool to help prioritize novel disease genes in sequence interpretation.
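To make the dN/dS idea concrete, the following is a minimal sketch of a per-gene ratio and a ranking by it. It is not the authors' implementation: the abstract does not describe how sites or rates were estimated, so the sketch assumes a simplified counts-per-site definition (variants observed divided by sites available). The names `GeneCounts`, `dn_ds`, and `rank_by_constraint`, the gene labels, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GeneCounts:
    """Per-gene tallies from a population variant set (toy values below)."""
    name: str
    nonsyn_variants: int   # observed non-synonymous variants
    syn_variants: int      # observed synonymous variants
    nonsyn_sites: float    # estimated number of non-synonymous sites
    syn_sites: float       # estimated number of synonymous sites


def dn_ds(g: GeneCounts) -> Optional[float]:
    """Simplified dN/dS: (nonsyn variants per nonsyn site) / (syn variants per syn site).

    Returns None when no synonymous variants were observed, because the
    ratio is undefined in that case.
    """
    if g.nonsyn_sites <= 0 or g.syn_sites <= 0:
        raise ValueError(f"{g.name}: site counts must be positive")
    dn = g.nonsyn_variants / g.nonsyn_sites
    ds = g.syn_variants / g.syn_sites
    if ds == 0:
        return None
    return dn / ds


def rank_by_constraint(genes: List[GeneCounts]) -> List[Tuple[str, float]]:
    """Sort genes from lowest to highest dN/dS, skipping undefined ratios."""
    scored = [(g.name, dn_ds(g)) for g in genes]
    defined = [(name, ratio) for name, ratio in scored if ratio is not None]
    return sorted(defined, key=lambda pair: pair[1])


# Hypothetical genes with made-up counts; not data from the study.
toy_genes = [
    GeneCounts("GENE_B", nonsyn_variants=30, syn_variants=10,
               nonsyn_sites=900.0, syn_sites=300.0),
    GeneCounts("GENE_A", nonsyn_variants=3, syn_variants=12,
               nonsyn_sites=900.0, syn_sites=300.0),
    GeneCounts("GENE_C", nonsyn_variants=5, syn_variants=0,
               nonsyn_sites=600.0, syn_sites=200.0),
]

ranking = rank_by_constraint(toy_genes)
for name, ratio in ranking:
    print(f"{name}\t{ratio:.3f}")
```

Under this simplified definition, a gene with few non-synonymous variants relative to its synonymous variants sorts to the top of the list, which mirrors the abstract's reasoning that a paucity of non-synonymous variation may indicate negative selection. Genes with no observed synonymous variants are left out of the ranking rather than assigned an arbitrary value.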