A putative scenario of how de novo protein-coding genes originate in the Saccharomyces cerevisiae lineage.

Author information

Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Fukuoka, Japan.

Mitsubishi Research Institute, Inc., Tokyo, Japan.

Publication information

BMC Genomics. 2024 Sep 5;25(Suppl 3):834. doi: 10.1186/s12864-024-10669-5.

Abstract

BACKGROUND

Novel protein-coding genes were long thought to arise through the reorganization of pre-existing genes, for example by gene duplication or gene fusion. However, recent progress in genome research has revealed that more protein-coding genes than expected were born de novo, that is, through the accumulation of mutations in non-genic DNA sequences. Nonetheless, the detailed process (scenario) of de novo origination is not well understood.

RESULTS

We devised a bioinformatic analysis for sketching a scenario of de novo origination of protein-coding genes. For each de novo protein-coding gene, we first used parsimony to identify the edge of a given phylogenetic tree on which the gene was born. Then, from a multiple sequence alignment of the de novo gene and its orthologous regions, we reconstructed the ancestral DNA sequences of the gene at both end nodes of that edge. Finally, we characterized the statistical features of the evolution between the two ancestral sequences. Applying this analysis to the Saccharomyces cerevisiae lineage, we sketched a putative scenario for de novo origination of protein-coding genes: (1) in the beginning there were GC-rich genome regions; (2) neutral mutations accumulated in those regions; (3) ORFs were extended or combined; and then (4) a translation signature (the Kozak consensus sequence) was recruited. Interestingly, as the scenario progresses from (2) to (4), the specificity of mutations increases.
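The first step, locating the edge on which a gene was born, can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single gain with no later loss, so the birth edge is simply the edge leading into the most recent common ancestor (MRCA) of the species carrying the gene. The tree topology, the node names (`N1`, `N2`, `root`), and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy tree stored as child -> parent; node names are illustrative only.
PARENT = {
    "S_cerevisiae": "N1", "S_paradoxus": "N1",
    "N1": "N2", "S_mikatae": "N2",
    "N2": "root", "S_bayanus": "root",
}

def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def birth_edge(leaves_with_gene, parent):
    """Return the (ancestor, descendant) edge on which a single gain is placed.

    Assumption (not from the source): exactly one gain and no loss, so the
    gene arises on the edge leading into the MRCA of all leaves carrying it.
    Returns None when the MRCA is the root, i.e. no birth edge exists in this tree.
    """
    paths = [path_to_root(leaf, parent) for leaf in leaves_with_gene]
    common = set(paths[0]).intersection(*paths[1:])
    # Walking upward from one leaf, the first node shared by all paths is the MRCA.
    mrca = next(node for node in paths[0] if node in common)
    if mrca not in parent:
        return None
    return (parent[mrca], mrca)

if __name__ == "__main__":
    print(birth_edge(["S_cerevisiae", "S_paradoxus"], PARENT))
```

The two nodes of the returned edge are the points at which ancestral sequences would then be reconstructed: the ancestor end represents the state before the gene existed and the descendant end the state after.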

CONCLUSION

To the best of our knowledge, this is the first report outlining a scenario of de novo origination of protein-coding genes. Because our bioinformatic analysis directly observes the evolution of ancestral sequences from non-genic to genic, it can capture events that occur within a short evolutionary time. This property makes it well suited to the analysis of fast-evolving de novo genes.
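A direct comparison of two ancestral sequences can be sketched as follows. This is only an illustration under stated assumptions: the two toy sequences are invented, ORFs are scanned on the forward strand only, and the Kozak-like check is reduced to "purine at position -3 relative to the start codon", which is a simplification and not the criterion used in the paper. All function names are hypothetical.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand, or None."""
    seq = seq.upper()
    best = None
    for match in re.finditer("(?=ATG)", seq):  # lookahead finds overlapping starts
        start = match.start()
        for i in range(start, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                end = i + 3  # end is exclusive and includes the stop codon
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end)
                break
    return best

def purine_at_minus3(seq, start):
    """Simplified Kozak-like check (assumption): A or G three bases upstream of ATG."""
    return start >= 3 and seq[start - 3].upper() in "AG"

# Invented toy sequences: a single T->A change removes a premature stop codon.
BEFORE = "GCAACCATGTAAGGGTAGCC"  # hypothetical ancestral node before the edge
AFTER = "GCAACCATGAAAGGGTAGCC"   # hypothetical ancestral node after the edge

if __name__ == "__main__":
    for name, seq in (("before", BEFORE), ("after", AFTER)):
        orf = longest_orf(seq)
        print(name, gc_content(seq), orf, purine_at_minus3(seq, orf[0]))
```

In this toy pair the GC content is unchanged while the longest ORF grows from 6 to 12 nucleotides, which mirrors the ORF-extension step of the scenario in miniature.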

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ce4/11378370/aa47a08f5c10/12864_2024_10669_Fig1_HTML.jpg
