Morgado Sergio, Antunes Deborah, Caffarena Ernesto, Vicente Ana Carolina
Laboratory of Molecular Genetics of Microorganisms, Oswaldo Cruz Institute (IOC-FIOCRUZ), Rio de Janeiro, Brazil.
Laboratory of Functional Genomics and Bioinformatics, Oswaldo Cruz Institute (IOC-FIOCRUZ), Rio de Janeiro, Brazil.
RNA Biol. 2020 Jul;17(7):1001-1008. doi: 10.1080/15476286.2020.1748922. Epub 2020 Apr 22.
Noncoding RNA (ncRNA) genes produce transcripts involved in a wide range of functions, including catalysis and regulation. In addition, some transcripts have highly complex structures that may affect their activity. Among the largest bacterial ncRNAs is the rare GOLLD RNA, which is associated with tRNA genes and thought to be chromosome- and phage-encoded in specialized groups of bacteria, including those from and orders. The only available GOLLD structure was inferred from a variety of sequences, including many from marine metagenomes. To explore GOLLD RNA in bacterial genomes, we mined the GOLLD gene in thousands of and virus genomes using Infernal software. We identified this gene in 350 mycobacteria, including megaplasmids, and in 39 bacteriophages, mainly in the genomic context of tRNA arrays. GOLLD genes showed high diversity and fell into three phylogenetic groups: (i) exclusive; (ii) and mycobacteriophages; and (iii) mycobacteriophage exclusive. We also determined the GOLLD secondary structure of each group using R2R software, based on GOLLD alignments generated by Infernal. All GOLLD groups displayed a conserved structure in the 3' half, including utter E-loops pseudoknots substructures, also shared by non GOLLD, whereas the 5' half motif differed among the groups. Here, we show that the lncRNA GOLLD is widespread in within tRNA arrays and corroborate the previously predicted GOLLD secondary structure.
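The genome-mining step described in the abstract could be post-processed along the following lines. This is only a minimal sketch, not the authors' pipeline: it assumes hits were written with Infernal's tabular output (for example `cmsearch --tblout hits.tbl model.cm genomes.fa`, where the file names are placeholders), and the E-value cutoff, the helper names `parse_tblout` and `significant_hits`, and the example rows are all hypothetical.

```python
from typing import Dict, List

# Names for the first 17 whitespace-separated fields of a `cmsearch --tblout`
# row. The 18th field (target description) may itself contain spaces.
TBLOUT_FIELDS = [
    "target_name", "target_accession", "query_name", "query_accession",
    "mdl", "mdl_from", "mdl_to", "seq_from", "seq_to", "strand",
    "trunc", "pass", "gc", "bias", "score", "e_value", "inc",
]


def parse_tblout(text: str) -> List[Dict[str, str]]:
    """Parse the text of an Infernal --tblout file into a list of hit records."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and the '#'-prefixed header/footer comments.
        if not line.strip() or line.startswith("#"):
            continue
        # Split at most 17 times so the free-text description stays intact.
        parts = line.split(None, len(TBLOUT_FIELDS))
        if len(parts) < len(TBLOUT_FIELDS):
            continue  # malformed row; ignore it in this sketch
        hit = dict(zip(TBLOUT_FIELDS, parts))
        hit["description"] = parts[17].strip() if len(parts) > 17 else "-"
        hits.append(hit)
    return hits


def significant_hits(hits: List[Dict[str, str]],
                     max_e_value: float = 1e-5) -> List[Dict[str, str]]:
    """Keep hits Infernal marks as included ('!') with E-value <= max_e_value.

    The default cutoff is an assumption for illustration, not a value
    taken from the study.
    """
    return [
        h for h in hits
        if h["inc"] == "!" and float(h["e_value"]) <= max_e_value
    ]


# Hypothetical rows in --tblout layout (not real search results).
EXAMPLE = """\
#target name accession query name accession mdl mdl from mdl to seq from seq to strand trunc pass gc bias score E-value inc description of target
genomeA - GOLLD - cm 1 500 1200 1700 + no 1 0.60 0.0 310.5 1.2e-90 ! hypothetical record A
genomeB - GOLLD - cm 1 500 90 40 - no 1 0.55 0.0 12.3 0.5 ? hypothetical record B
"""

if __name__ == "__main__":
    for hit in significant_hits(parse_tblout(EXAMPLE)):
        print(hit["target_name"], hit["seq_from"], hit["seq_to"], hit["e_value"])
```

Filtering on the inclusion flag as well as the E-value keeps the sketch aligned with Infernal's own reporting threshold; a real analysis would also need to inspect each hit's genomic context (for instance, proximity to tRNA arrays), which this example does not attempt.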