Zhu Yanglong, Pulukkunat Dileep K, Li Yong
Department of Biochemistry and Molecular Biology, and Center for Genetics and Molecular Medicine, School of Medicine, University of Louisville, Louisville, KY 40202, USA.
Nucleic Acids Res. 2007;35(7):2283-94. doi: 10.1093/nar/gkm057. Epub 2007 Mar 27.
Metagenomics has been employed to systematically sequence, classify, analyze and manipulate the entire genetic material isolated from environmental samples. Finding genes within metagenomic sequences remains a formidable challenge, and noncoding RNA genes other than those encoding rRNA and tRNA are not well annotated in metagenomic projects. In this work, we identify, validate and analyze the genes coding for RNase P RNA (P RNA) from all published metagenomic projects. P RNA is the RNA subunit of RNase P, a ubiquitous endoribonuclease that consists of one RNA subunit and one or more protein subunits. Bacterial P RNAs are classified into two types, Type A and Type B, based on the constituents of the structure involved in precursor tRNA binding. Archaeal P RNAs are classified into Type A and Type M, of which Type A is ancestral and close to bacterial Type A P RNA. Bacterial and some archaeal P RNAs are catalytically active without protein subunits, capable of cleaving precursor tRNA transcripts to produce their mature 5'-termini. We have found 328 distinct P RNAs (320 bacterial and 8 archaeal) in all published metagenomic sequences, which expands the total number of known prokaryotic P RNAs by 60%. Surprisingly, all newly identified P RNAs from metagenomic sequences are Type A, i.e. neither Type B bacterial nor Type M archaeal P RNAs were found. We experimentally validate the authenticity of an archaeal P RNA from the Sargasso Sea. One distinctive feature of some new P RNAs is that the P2 stem has kinked nucleotides in its 5' strand. We find that the single-nucleotide J2/3 joint region linking the P2 and P3 stems, previously used to distinguish a bacterial P RNA from an archaeal one, is no longer a reliable criterion, i.e. some archaeal P RNAs also have only one nucleotide in the J2/3 joint. We also discuss phylogenetic analysis based on a covariance model of P RNA, which offers a few advantages over analysis based on 16S rRNA.