Detection of Regional Variation in Selection Intensity within Protein-Coding Genes Using DNA Sequence Polymorphism and Divergence.

Author information

Zhao Zi-Ming, Campbell Michael C, Li Ning, Lee Daniel S W, Zhang Zhang, Townsend Jeffrey P

Affiliations

Department of Biostatistics, Yale University, New Haven, CT.

Department of Biology, Howard University, Washington, DC.

Publication information

Mol Biol Evol. 2017 Nov 1;34(11):3006-3022. doi: 10.1093/molbev/msx213.

Abstract

Numerous approaches have been developed to infer natural selection based on the comparison of polymorphism within species and divergence between species. These methods are especially powerful for the detection of uniform selection operating across a gene. However, empirical analyses have demonstrated that regions of protein-coding genes exhibiting clusters of amino acid substitutions are subject to different levels of selection relative to other regions of the same gene. To quantify this heterogeneity of selection within coding sequences, we developed Model Averaged Site Selection via Poisson Random Field (MASS-PRF). MASS-PRF identifies an ensemble of intragenic clustering models for polymorphic and divergent sites. This ensemble of models is used within the Poisson Random Field framework to estimate selection intensity on a site-by-site basis. Using simulations, we demonstrate that MASS-PRF has high power to detect clusters of amino acid variants in small genic regions, can reliably estimate the probability of a variant occurring at each nucleotide site in sequence data, and is robust to historical demographic trends and recombination. We applied MASS-PRF to human gene polymorphism data derived from the 1000 Genomes Project and divergence data from the common chimpanzee. On the basis of this analysis, we discovered striking regional variation in selection intensity, indicative of positive or negative selection, in well-defined domains of genes that have previously been associated with neurological processing, immunity, and reproduction. We suggest that amino acid-altering substitutions within these regions likely are or have been selectively advantageous in the human lineage, playing important roles in protein function.
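
The abstract's idea of averaging over an ensemble of intragenic clustering models to obtain a per-site variant probability can be illustrated with a toy sketch. This is not the MASS-PRF implementation: the single-cluster Bernoulli model, the AIC-based weights, the parameter counts (1 for the uniform model, 4 for a cluster model), and the function names are all assumptions made for illustration, and the Poisson Random Field step that converts such probabilities into selection intensities is omitted.

```python
import math

def bernoulli_loglik(k, n):
    """Maximized Bernoulli log-likelihood for k variant sites among n sites."""
    if n == 0 or k == 0 or k == n:
        return 0.0  # the fitted probability is 0 or 1, so the likelihood is 1
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)

def model_averaged_site_probs(sites):
    """Model-averaged probability of a variant at each site.

    sites: non-empty list of 0/1 flags (1 = variant observed at that site).
    Candidate models (an assumption of this sketch): one uniform-rate model,
    plus every model with a single contiguous cluster [start, end) that has
    its own variant rate, distinct from the background rate.
    """
    n = len(sites)
    if n == 0:
        raise ValueError("sites must be non-empty")
    prefix = [0]
    for s in sites:
        prefix.append(prefix[-1] + s)
    total = prefix[-1]

    # Each model is stored as (AIC, start, end, p_in, p_out).
    models = [(2 * 1 - 2 * bernoulli_loglik(total, n), 0, n, total / n, total / n)]
    for start in range(n):
        for end in range(start + 1, n + 1):
            if start == 0 and end == n:
                continue  # identical to the uniform model
            n_in = end - start
            k_in = prefix[end] - prefix[start]
            n_out, k_out = n - n_in, total - k_in
            loglik = bernoulli_loglik(k_in, n_in) + bernoulli_loglik(k_out, n_out)
            # Assumed parameter count: two rates plus two cluster boundaries.
            models.append((2 * 4 - 2 * loglik, start, end, k_in / n_in, k_out / n_out))

    # Akaike weights: relative support for each model, normalized to sum to 1.
    best = min(m[0] for m in models)
    raw = [math.exp(-(m[0] - best) / 2) for m in models]
    norm = sum(raw)

    probs = [0.0] * n
    for w, (_, start, end, p_in, p_out) in zip(raw, models):
        for j in range(n):
            probs[j] += (w / norm) * (p_in if start <= j < end else p_out)
    return probs

# Hypothetical input: a short cluster of variants flanked by invariant sites.
example = [0] * 20 + [1, 1, 0, 1, 1, 1, 0, 1] + [0] * 20
profile = model_averaged_site_probs(example)
```

In this sketch, sites inside the variant-rich stretch receive a much higher averaged probability than the flanking sites, and no single cluster boundary has to be chosen in advance because every candidate boundary contributes in proportion to its weight.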
