Modeling coding-sequence evolution within the context of residue solvent accessibility.

Author information

Center for Computational Biology and Bioinformatics, Institute for Cellular and Molecular Biology, and Section of Integrative Biology, The University of Texas at Austin, Austin, TX 78712, USA.

Publication information

BMC Evol Biol. 2012 Sep 12;12:179. doi: 10.1186/1471-2148-12-179.

Abstract

BACKGROUND

Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues).

RESULTS

Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprising nearly 600 yeast genes, and find that an evolutionary-rate ratio ω that varies linearly with RSA provides a better model fit than an RSA-independent ω or an ω that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transversion ratio κ also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates are each linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between ω and RSA, and gene expression level affects both the intercept and the slope.
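
As a rough illustration of the three ω parameterizations compared above (RSA-independent, estimated per RSA bin, and linear in RSA), the following Python sketch may help. The function names, bin edges, and numeric values are hypothetical and are not taken from the paper. Codon models of this kind are typically fit by maximum likelihood on sequence alignments, which this sketch does not attempt.

```python
from bisect import bisect_right

# Hypothetical parameter values, chosen only for illustration.
BIN_EDGES = [0.25, 0.5, 0.75]          # interior edges of four RSA bins
BIN_OMEGAS = [0.05, 0.10, 0.15, 0.20]  # one omega per bin


def omega_constant(rsa, omega):
    """RSA-independent model: a single omega for every site (1 parameter)."""
    return omega


def omega_binned(rsa, bin_edges, bin_omegas):
    """Binned model: a separate omega for each RSA bin (one parameter per bin).

    Bins are half-open, [lower, upper), so an RSA equal to an edge
    falls into the bin above that edge.
    """
    return bin_omegas[bisect_right(bin_edges, rsa)]


def omega_linear(rsa, intercept, slope):
    """Linear model: omega = intercept + slope * RSA (2 parameters)."""
    return intercept + slope * rsa


if __name__ == "__main__":
    for rsa in (0.0, 0.3, 0.6, 1.0):
        print(
            rsa,
            omega_constant(rsa, 0.12),
            omega_binned(rsa, BIN_EDGES, BIN_OMEGAS),
            omega_linear(rsa, 0.05, 0.15),
        )
```

The linear form describes the whole RSA range with two parameters, whereas the binned form needs one parameter per bin. That difference in parameter count is one reason a linear model can compare favorably when fits are penalized for complexity.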

CONCLUSIONS

Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between ω and RSA implies that genes are better characterized by their ω slope and intercept than by just their mean ω.
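
To see why a mean ω alone can hide differences between genes, consider this minimal Python sketch with two hypothetical genes. The per-site ω values are invented for illustration, and ordinary least squares is used only as a simple stand-in; it is not the fitting procedure of the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    mean_x = mean(xs)
    mean_y = mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical RSA values shared by both genes.
rsa = [0.0, 0.25, 0.5, 0.75, 1.0]

# Gene A: omega does not depend on RSA (invented values).
gene_a = [0.10 for _ in rsa]

# Gene B: omega rises with RSA but has the same mean as gene A (invented values).
gene_b = [0.02 + 0.16 * x for x in rsa]

if __name__ == "__main__":
    for name, omegas in (("A", gene_a), ("B", gene_b)):
        intercept, slope = fit_line(rsa, omegas)
        print(name, "mean:", mean(omegas), "intercept:", intercept, "slope:", slope)
```

Both hypothetical genes share the same mean ω, yet their intercepts and slopes differ, which is the distinction the conclusion draws.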

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b17d/3527230/b9d956053591/1471-2148-12-179-1.jpg
