Gulko Brad, Hubisz Melissa J, Gronau Ilan, Siepel Adam
Graduate Field of Computer Science, Cornell University, Ithaca, New York, USA.
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, USA.
Nat Genet. 2015 Mar;47(3):276-83. doi: 10.1038/ng.3196. Epub 2015 Jan 19.
We describe a new computational method for estimating the probability that a point mutation at each position in a genome will influence fitness. These 'fitness consequence' (fitCons) scores serve as evolution-based measures of potential genomic function. Our approach is to cluster genomic positions into groups exhibiting distinct 'fingerprints' on the basis of high-throughput functional genomic data, then to estimate a probability of fitness consequences for each group from associated patterns of genetic polymorphism and divergence. We have generated fitCons scores for three human cell types on the basis of public data from ENCODE. In comparison with conventional conservation scores, fitCons scores show considerably improved prediction power for cis regulatory elements. In addition, fitCons scores indicate that 4.2-7.5% of nucleotides in the human genome have influenced fitness since the human-chimpanzee divergence, and they suggest that recent evolutionary turnover has had limited impact on the functional content of the genome.
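The two-stage approach described above (group positions by functional-genomic fingerprint, then estimate one score per group from polymorphism patterns) can be illustrated with a minimal Python sketch. The abstract does not specify the estimator, so the per-group score below is a crude stand-in, not the authors' method: it measures how depleted a group is in polymorphic sites relative to an assumed neutral rate. The function name `fitcons_sketch`, the fingerprint labels, the `neutral_poly_rate` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def fitcons_sketch(positions, neutral_poly_rate):
    """Toy two-stage scoring: group sites by fingerprint, then score each group.

    positions: iterable of (fingerprint, is_polymorphic) pairs, where
        fingerprint is any hashable summary of functional-genomic signals.
    neutral_poly_rate: assumed fraction of polymorphic sites under neutrality.

    Returns a dict mapping fingerprint -> score in [0, 1].
    NOTE: the score is a simple polymorphism-depletion proxy, used here only
    as an illustration; it is not the estimator used in the paper.
    """
    # Stage 1: cluster positions by their fingerprint.
    counts = defaultdict(lambda: [0, 0])  # fingerprint -> [n_sites, n_polymorphic]
    for fingerprint, is_polymorphic in positions:
        counts[fingerprint][0] += 1
        counts[fingerprint][1] += int(is_polymorphic)

    # Stage 2: one score per group, shared by every position in that group.
    scores = {}
    for fingerprint, (n_sites, n_poly) in counts.items():
        depletion = 1.0 - (n_poly / n_sites) / neutral_poly_rate
        scores[fingerprint] = min(1.0, max(0.0, depletion))
    return scores


# Hypothetical toy data: three fingerprints with four sites each.
toy_positions = (
    [(("open", "coding"), False)] * 4
    + [(("open", "noncoding"), True)] + [(("open", "noncoding"), False)] * 3
    + [(("closed", "noncoding"), True)] * 2 + [(("closed", "noncoding"), False)] * 2
)
scores = fitcons_sketch(toy_positions, neutral_poly_rate=0.5)
```

In this toy setup every position inherits the score of its group, which mirrors the idea that a fitCons score is a property of a fingerprint class rather than of an individual site. A real analysis would also use divergence data and a proper statistical model, both of which this sketch omits.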