

Evaluating rare amino acid substitutions (RGC_CAMs) in a yeast model clade.

Authors

Polzin Kenneth, Rokas Antonis

Affiliation

Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America.

Publication information

PLoS One. 2014 Mar 17;9(3):e92213. doi: 10.1371/journal.pone.0092213. eCollection 2014.

Abstract

When inferring phylogenetic relationships, not all sites in a sequence alignment are equally informative. One recently proposed approach that takes advantage of this inequality relies on sites that contain amino acids whose replacement requires multiple substitutions. Identifying these so-called RGC_CAM substitutions (after Rare Genomic Changes as Conserved Amino acids-Multiple substitutions) requires that, first, at any given site in the amino acid sequence alignment, there must be a minimum of two different amino acids; second, each amino acid must be present in at least two taxa; and third, the amino acids must require a minimum of two nucleotide substitutions to replace each other. Although theory suggests that RGC_CAM substitutions are expected to be rare and less likely to be homoplastic, the informativeness of RGC_CAM substitutions has not been extensively evaluated in biological data sets. We investigated the quality of RGC_CAM substitutions by examining their degree of homoplasy and internode certainty in nearly 2.7 million aligned amino acid sites from 5,261 proteins from five species belonging to the yeast Saccharomyces sensu stricto clade whose phylogeny is well-established. We identified 2,647 sites containing RGC_CAM substitutions, a number that contrasts sharply with the 100,887 sites containing RGC_non-CAM substitutions (i.e., changes between amino acids that require only a single nucleotide substitution). We found that RGC_CAM substitutions had significantly lower homoplasy than RGC_non-CAM ones; specifically RGC_CAM substitutions showed a per-site average homoplasy index of 0.100, whereas RGC_non-CAM substitutions had a homoplasy index of 0.215. Internode certainty values were also higher for sites containing RGC_CAM substitutions than for RGC_non-CAM ones. These results suggest that RGC_CAM substitutions possess a strong phylogenetic signal and are useful markers for phylogenetic inference despite their rarity.
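The three criteria above can be sketched as a small per-site check. This is an illustrative sketch, not the authors' pipeline: the function names `min_nt_distance` and `is_rgc_cam` are hypothetical, the standard genetic code (NCBI translation table 1) is assumed, and columns containing gaps, stops, or ambiguous residues are simply rejected, which is an assumption not stated in the abstract.

```python
from itertools import combinations, product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and "*" for stop) to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append("".join(codon))


def min_nt_distance(aa1, aa2):
    """Fewest nucleotide differences between any codon of aa1 and any codon of aa2."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS_FOR[aa1]
        for c2 in CODONS_FOR[aa2]
    )


def is_rgc_cam(column):
    """Return True if an alignment column (one residue per taxon) meets all three criteria."""
    # Assumption: columns with gaps, stops, or ambiguous residues are rejected.
    if any(aa == "*" or aa not in CODONS_FOR for aa in column):
        return False
    counts = {aa: column.count(aa) for aa in set(column)}
    # Criterion 1: at least two different amino acids at the site.
    if len(counts) < 2:
        return False
    # Criterion 2: each amino acid is present in at least two taxa.
    if min(counts.values()) < 2:
        return False
    # Criterion 3: every pair of amino acids needs at least two nucleotide substitutions.
    return all(min_nt_distance(a, b) >= 2 for a, b in combinations(counts, 2))
```

For example, `is_rgc_cam("WWFFF")` holds because tryptophan (TGG) differs from both phenylalanine codons (TTT, TTC) at two positions, whereas `is_rgc_cam("DDEEE")` does not, since aspartate and glutamate codons can differ by a single nucleotide. With five taxa, criterion 2 limits a qualifying site to exactly two amino acids, split two and three.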


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d619/3956930/c366eeddc8f6/pone.0092213.g001.jpg
