Ancestral Sequence Reconstruction as a Tool to Detect and Study De Novo Gene Emergence.

Author information

Institute for Fundamental Biomedical Research, BSRC "Alexander Fleming", Vari, Greece.

Pittsburgh Center for Evolutionary Biology and Medicine, Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.

Publication information

Genome Biol Evol. 2024 Aug 5;16(8). doi: 10.1093/gbe/evae151.

Abstract

New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows us to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow to ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.
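The abstract does not specify how the randomization-based empirical P-value is computed, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a toy conservation statistic (the fraction of identical in-frame codons between two pre-aligned, equal-length sequences) and a toy null model (shuffling the nucleotides of the reconstructed ancestral sequence). The function names `codon_identity` and `empirical_p_value` and the parameter `n_random` are hypothetical and chosen for illustration.

```python
import random


def codon_identity(a: str, b: str) -> float:
    """Fraction of in-frame codons that are identical between two sequences.

    Assumption: both sequences are already aligned, gap-free, of equal
    length, and that length is a multiple of 3.
    """
    if not a or len(a) != len(b) or len(a) % 3 != 0:
        raise ValueError("sequences must be non-empty, equal-length, multiple of 3")
    codons_a = [a[i:i + 3] for i in range(0, len(a), 3)]
    codons_b = [b[i:i + 3] for i in range(0, len(b), 3)]
    matches = sum(x == y for x, y in zip(codons_a, codons_b))
    return matches / len(codons_a)


def empirical_p_value(extant: str, ancestral: str,
                      n_random: int = 1000, seed: int = 0) -> float:
    """Empirical P-value for the observed extant/ancestral conservation.

    Toy null model (an assumption): shuffle the ancestral nucleotides and
    recompute the statistic. The P-value is the share of randomizations
    scoring at least as high as the observation, with the usual +1
    correction so that it is never exactly zero.
    """
    rng = random.Random(seed)
    observed = codon_identity(extant, ancestral)
    bases = list(ancestral)
    hits = 0
    for _ in range(n_random):
        rng.shuffle(bases)
        if codon_identity(extant, "".join(bases)) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)


if __name__ == "__main__":
    extant = "ATGGCTAAACCGTTTGGATCC"
    # Identical ancestral frame: conservation is far above chance.
    print(empirical_p_value(extant, extant))
    # No shared codons: conservation is fully explained by chance.
    print(empirical_p_value("AAAAAA", "CCCCCC"))
```

A small P-value indicates that the observed reading-frame conservation is unlikely under the chosen null model. In practice, the choice of statistic and of randomization scheme determines what "chance" means, so both would need to follow the paper's methods rather than this simplified stand-in.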

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ee2/11299112/fecfc297cb64/evae151f1.jpg
