Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models.

Author information

Karev Georgy P, Wolf Yuri I, Berezovskaya Faina S, Koonin Eugene V

Affiliation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Publication information

BMC Evol Biol. 2004 Sep 9;4:32. doi: 10.1186/1471-2148-4-32.

Abstract

BACKGROUND

The size distribution of gene families in a broad range of genomes is well approximated by a generalized Pareto function. Evolution of ensembles of gene families can be described with Birth, Death, and Innovation Models (BDIMs). Analysis of the properties of different versions of BDIMs has the potential to reveal important features of genome evolution.
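
For orientation, a generalized Pareto function for family sizes is commonly written in the form below. The symbols are notation assumed here for illustration and are not defined in this abstract: f_i is the frequency of families with i members, N is the largest family size, a and \gamma are fitted parameters, and c is a normalization constant.

```latex
f_i = c\,(i + a)^{-\gamma}, \qquad i = 1, 2, \dots, N,
\qquad c = \Bigl[\sum_{j=1}^{N} (j + a)^{-\gamma}\Bigr]^{-1}
```

For i much larger than a this behaves as a power law, f_i \approx c\, i^{-\gamma}.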

RESULTS

In this work, we extend our previous analysis of stochastic BDIMs. In addition to the previously examined rational BDIMs, we introduce potentially more realistic logistic BDIMs, in which birth and death rates are limited for the largest families, and show that their properties are similar to those of models that include no such limitation. We show that the mean time required for the formation of the largest gene families detected in eukaryotic genomes is limited by the mean number of duplications per gene and does not increase indefinitely with the model degree. Instead, this time reaches a minimum value, which corresponds to a non-linear rational BDIM of degree approximately 2.7. Even for this BDIM, the mean time to formation of the largest family is orders of magnitude greater than any realistic estimate based on the timescale of life's evolution. We employed the embedding chains technique to estimate the expected number of elementary evolutionary events (gene duplications and deletions) preceding the formation of gene families of the observed size and found that the mean number of events exceeds the family size by orders of magnitude, suggesting a highly dynamic process of genome evolution. The variance of the time required for the formation of the largest families was found to be extremely large, with a coefficient of variation much greater than 1. This indicates that some gene families might grow much faster than the mean rate, so that the minimal time required for family formation is more relevant for a realistic representation of genome evolution than the mean time. We determined this minimal time using Monte Carlo simulations of family growth from an ensemble of simultaneously evolving singletons. In these simulations, the time elapsed before the formation of the largest family was much shorter than the estimated mean time and was compatible with the timescale of evolution of eukaryotes.
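
The contrast between the mean and the minimal formation time can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation: the linear rate functions, their parameters (`lam`, `delta`, `a`, `b`), the function names, and the target size are all hypothetical choices made for illustration, and innovation (the arrival of new singletons) is ignored. Each family starts as a singleton at time 0 and evolves independently until it reaches the target size or goes extinct.

```python
import random

def birth_rate(i, lam=1.0, a=1.0):
    # Hypothetical duplication rate for a family of size i (assumed form).
    return lam * (i + a)

def death_rate(i, delta=1.0, b=0.0):
    # Hypothetical deletion rate for a family of size i (assumed form).
    return delta * (i + b)

def first_passage(target, rng, birth=birth_rate, death=death_rate,
                  max_events=1_000_000):
    """Grow one family from size 1 until it hits `target` or size 0.

    Returns (reached_target, elapsed_time, number_of_events).
    """
    size, t, events = 1, 0.0, 0
    while 0 < size < target and events < max_events:
        b, d = birth(size), death(size)
        total = b + d
        t += rng.expovariate(total)           # waiting time to the next event
        size += 1 if rng.random() < b / total else -1
        events += 1
    return size >= target, t, events

def summarize(target, n_families, seed=0):
    """Run an ensemble of singletons; summarize the families that reach target."""
    rng = random.Random(seed)
    times, counts = [], []
    for _ in range(n_families):
        reached, t, n = first_passage(target, rng)
        if reached:
            times.append(t)
            counts.append(n)
    if not times:
        return None
    return {
        "n_reached": len(times),
        "min_time": min(times),               # first family in the ensemble to arrive
        "mean_time": sum(times) / len(times),
        "mean_events": sum(counts) / len(counts),
    }

if __name__ == "__main__":
    print(summarize(target=20, n_families=2000))
```

Because all families start together, the minimum over the ensemble is the time at which the first family of the target size appears, which is the quantity the abstract argues is more relevant than the mean. The mean event count reported here can likewise be compared with the target size.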

CONCLUSIONS

The analysis of stochastic BDIMs presented here shows that non-linear versions of such models can approximate well not only the size distribution of gene families but also the dynamics of their formation during genome evolution. The fact that only higher-degree BDIMs are compatible with the observed characteristics of genome evolution suggests that the growth of gene families is self-accelerating, which might reflect differential selective pressure acting on different genes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b371/523855/36343609355d/1471-2148-4-32-1.jpg
