Zasha Weinberg, Jonathan Perreault, Michelle M. Meyer, Ronald R. Breaker
Howard Hughes Medical Institute, New Haven, Connecticut 06520-8103, USA.
Nature. 2009 Dec 3;462(7273):656-9. doi: 10.1038/nature08586.
Estimates of the total number of bacterial species indicate that existing DNA sequence databases carry only a tiny fraction of the total amount of DNA sequence space represented by this division of life. Indeed, environmental DNA samples have been shown to encode many previously unknown classes of proteins and RNAs. Bioinformatics searches of genomic DNA from bacteria commonly identify new noncoding RNAs (ncRNAs) such as riboswitches. In rare instances, RNAs that exhibit more extensive sequence and structural conservation across a wide range of bacteria are encountered. Given that large structured RNAs are known to carry out complex biochemical functions such as protein synthesis and RNA processing reactions, identifying more RNAs of great size and intricate structure is likely to reveal additional biochemical functions that can be achieved by RNA. We applied an updated computational pipeline to discover ncRNAs that rival the known large ribozymes in size and structural complexity or that are among the most abundant RNAs in bacteria that encode them. These RNAs would have been difficult or impossible to detect without examining environmental DNA sequences, indicating that numerous RNAs with extraordinary size, structural complexity, or other exceptional characteristics remain to be discovered in unexplored sequence space.