Differential Retention of Pfam Domains Contributes to Long-term Evolutionary Trends.

Affiliations

Department of Ecology & Evolutionary Biology, University of Arizona, Tucson, AZ.

Department of Ecology and Genetics, Uppsala University, Uppsala, Sweden.

Publication information

Mol Biol Evol. 2023 Apr 4;40(4). doi: 10.1093/molbev/msad073.

Abstract

Protein domains that emerged more recently in evolution have a higher structural disorder and greater clustering of hydrophobic residues along the primary sequence. It is hard to explain how selection acting via descent with modification could act so slowly as not to saturate over the extraordinarily long timescales over which these trends persist. Here, we hypothesize that the trends were created by a higher level of selection that differentially affects the retention probabilities of protein domains with different properties. This hypothesis predicts that loss rates should depend on disorder and clustering trait values. To test this, we inferred loss rates via maximum likelihood for animal Pfam domains, after first performing a set of stringent quality control methods to reduce annotation errors. Intermediate trait values, matching those of ancient domains, are associated with the lowest loss rates, making our results difficult to explain with reference to previously described homology detection biases. Simulations confirm that effect sizes are of the right magnitude to produce the observed long-term trends. Our results support the hypothesis that differential domain loss slowly weeds out those protein domains that have nonoptimal levels of disorder and clustering. The same preferences also shape the differential diversification of Pfam domains, thereby further impacting proteome composition.
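The core idea, that trait-dependent loss alone can produce a trend with domain age, can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not the authors' maximum-likelihood inference or their simulations: the quadratic `loss_rate` function, the `optimum`, `base`, and `penalty` values, the uniform distribution of trait values at birth, and the constant-rate (exponential) loss process are all hypothetical choices made for illustration.

```python
import math


def loss_rate(trait, optimum=0.3, base=0.1, penalty=4.0):
    """Hypothetical loss rate per unit time, lowest at the optimum trait value."""
    return base + penalty * (trait - optimum) ** 2


def mean_trait_of_survivors(age, n_grid=1001):
    """Mean trait among domains still present after `age` time units.

    Assumes (for illustration only) that trait values at birth are uniform
    on [0, 1] and that each domain is lost at a constant rate, so the
    probability of surviving to `age` is exp(-loss_rate * age).
    """
    traits = [i / (n_grid - 1) for i in range(n_grid)]
    survival = [math.exp(-loss_rate(x) * age) for x in traits]
    return sum(x * s for x, s in zip(traits, survival)) / sum(survival)


# Older cohorts have been filtered for longer, so their mean trait
# moves from the birth mean (0.5) toward the assumed optimum (0.3).
ages = [0.0, 1.0, 5.0, 20.0]
means = [mean_trait_of_survivors(a) for a in ages]
for a, m in zip(ages, means):
    print(f"age {a:5.1f}: mean surviving trait = {m:.3f}")
```

In this sketch no lineage changes its trait value over time. The age trend arises purely because domains far from the assumed optimum are lost faster, which is the kind of higher-level sorting the abstract contrasts with descent with modification.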

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e5f/10089649/6538f38fe0cd/msad073f1.jpg
