Detection of Regional Variation in Selection Intensity within Protein-Coding Genes Using DNA Sequence Polymorphism and Divergence.

Authors

Zhao Zi-Ming, Campbell Michael C, Li Ning, Lee Daniel S W, Zhang Zhang, Townsend Jeffrey P

Affiliations

Department of Biostatistics, Yale University, New Haven, CT.

Department of Biology, Howard University, Washington, DC.

Publication information

Mol Biol Evol. 2017 Nov 1;34(11):3006-3022. doi: 10.1093/molbev/msx213.

Abstract

Numerous approaches have been developed to infer natural selection based on the comparison of polymorphism within species and divergence between species. These methods are especially powerful for the detection of uniform selection operating across a gene. However, empirical analyses have demonstrated that regions of protein-coding genes exhibiting clusters of amino acid substitutions are subject to different levels of selection relative to other regions of the same gene. To quantify this heterogeneity of selection within coding sequences, we developed Model Averaged Site Selection via Poisson Random Field (MASS-PRF). MASS-PRF identifies an ensemble of intragenic clustering models for polymorphic and divergent sites. This ensemble of models is used within the Poisson Random Field framework to estimate selection intensity on a site-by-site basis. Using simulations, we demonstrate that MASS-PRF has high power to detect clusters of amino acid variants in small genic regions, can reliably estimate the probability of a variant occurring at each nucleotide site in sequence data, and is robust to historical demographic trends and recombination. We applied MASS-PRF to human gene polymorphism derived from the 1000 Genomes Project and divergence data from the common chimpanzee. On the basis of this analysis, we discovered striking regional variation in selection intensity, indicative of positive or negative selection, in well-defined domains of genes that have previously been associated with neurological processing, immunity, and reproduction. We suggest that amino acid-altering substitutions within these regions likely are or have been selectively advantageous in the human lineage, playing important roles in protein function.
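
Illustrative sketch

The abstract describes averaging over an ensemble of intragenic clustering models to obtain a per-site probability of a variant. The Python sketch below illustrates that general idea only; it is not the MASS-PRF implementation and omits the Poisson Random Field step entirely. It assumes a single contiguous cluster per model, a Bernoulli rate inside and outside the cluster, and Akaike weights for the averaging. The function names, the parameter count used in the AIC, and the toy input are all hypothetical choices made for illustration.

```python
import math


def bernoulli_loglik(k, n):
    """Maximised Bernoulli log-likelihood for k variant sites among n sites."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def model_averaged_site_probs(sites):
    """Average per-site variant probability over all one-cluster models.

    sites: list of 0/1 flags (1 = variant observed at that site).
    Assumption: models are weighted by Akaike weights; this is an
    illustrative choice, not a statement about MASS-PRF internals.
    """
    n = len(sites)
    total = sum(sites)
    prefix = [0]
    for s in sites:
        prefix.append(prefix[-1] + s)

    # Each model: (aic, start, end, rate_inside, rate_outside)
    # Null model: one uniform rate across the whole sequence (1 parameter).
    models = [(2 * 1 - 2 * bernoulli_loglik(total, n), 0, n, total / n, total / n)]
    for start in range(n):
        for end in range(start + 1, n + 1):
            if start == 0 and end == n:
                continue  # identical to the null model
            k_in = prefix[end] - prefix[start]
            n_in = end - start
            k_out, n_out = total - k_in, n - n_in
            ll = bernoulli_loglik(k_in, n_in) + bernoulli_loglik(k_out, n_out)
            # Assumed parameter count: two rates plus two boundaries.
            models.append((2 * 4 - 2 * ll, start, end, k_in / n_in, k_out / n_out))

    best = min(m[0] for m in models)
    weights = [math.exp(-(m[0] - best) / 2) for m in models]
    z = sum(weights)

    probs = [0.0] * n
    for w, (_, start, end, p_in, p_out) in zip(weights, models):
        for i in range(n):
            probs[i] += (w / z) * (p_in if start <= i < end else p_out)
    return probs


# Toy input: a short cluster of variants flanked by invariant sites.
toy_sites = [0] * 10 + [1, 1, 0, 1, 1] + [0] * 10
toy_probs = model_averaged_site_probs(toy_sites)
```

In this toy input, sites inside the cluster receive a higher averaged probability than flanking sites, because models that place the cluster boundaries around positions 10 to 14 carry most of the weight. The exhaustive enumeration is cubic in sequence length, so it is only suitable for short illustrative sequences.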
