SnoBIRD: a tool to identify C/D box snoRNAs and refine their annotation across all eukaryotes.

Author information

Fafard-Couture Étienne, Boulanger Cédric, Faucher-Giguère Laurence, Sinagoga Vanessa, Berthoumieux Mélodie, Hedjam Jordan, Marcel Virginie, Durand Sébastien, Bayfield Mark A, Bachand François, Abou Elela Sherif, Jacques Pierre-Étienne, Scott Michelle S

Affiliations

Département de biochimie et de génomique fonctionnelle, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, Québec J1E 4K8, Canada.

Centre de recherche du Centre hospitalier universitaire de Sherbrooke (CRCHUS), Sherbrooke, Québec J1H 5N3, Canada.

Publication information

Nucleic Acids Res. 2025 Jul 19;53(14). doi: 10.1093/nar/gkaf708.

Abstract

Small nucleolar RNAs (snoRNAs), a group of noncoding RNAs present in all eukaryotes, are most extensively characterized for their regulation of ribosome biogenesis and splicing. Despite their central roles, current snoRNA annotations remain incomplete. Several eukaryote genome annotations contain few or no snoRNAs, and none distinguish expressed snoRNAs from their pseudogenes, a recently characterized snoRNA subclass with distinct features and expression levels. To address this, we developed SnoBIRD, a BERT-based C/D box snoRNA predictor trained on snoRNAs spanning all eukaryote kingdoms. We show that SnoBIRD outperforms existing tools and is the only predictor capable of identifying snoRNA pseudogenes using biologically relevant signal. Applying SnoBIRD to the fission yeast and human genomes, we demonstrate that only SnoBIRD scales well with genome size in terms of runtime, and we identify and experimentally validate several new SnoBIRD-predicted C/D box snoRNAs. By running SnoBIRD on multiple eukaryote genomes, we identify hundreds of novel snoRNA candidates and highlight SnoBIRD's usefulness for determining the evolutionary paths of snoRNAs distributed across different species. Overall, SnoBIRD represents a user-friendly and efficient tool for reliably predicting C/D box snoRNAs and their pseudogenes in any eukaryote genome.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63c9/12309372/abd23e8d3d7c/gkaf708figgra1.jpg
