Wang Ming, Fleming Joy, Li Zihui, Li Chuanyou, Zhang Hongtai, Xue Yunxin, Chen Maoshan, Zhang Zongde, Zhang Xian-En, Bi Lijun
Key Laboratory of Non-Coding RNA & State Key Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
Key Laboratory of Non-Coding RNA & State Key Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.
Acta Biochim Biophys Sin (Shanghai). 2016 Jun;48(6):544-53. doi: 10.1093/abbs/gmw037. Epub 2016 May 12.
Deep sequencing of bacterial transcriptomes with RNA-Seq has made it possible to identify small non-coding RNAs (sRNAs), RNA molecules that regulate gene expression in response to changing environments, on a genome-wide scale in an ever-increasing range of prokaryotes. However, a simple and reliable automated method for identifying sRNA candidates in these large datasets is lacking. Here, after generating a transcriptome from an exponential-phase culture of Mycobacterium tuberculosis H37Rv, we developed and validated an automated method for the genome-wide identification of candidate sRNA-containing regions in RNA-Seq datasets, based on the characteristics of read coverage maps. We identified 192 novel candidate sRNA-encoding regions in intergenic regions and 664 RNA transcripts transcribed from regions antisense (as) to open reading frames (ORFs), which bear the characteristics of asRNAs, and validated 28 of these novel sRNA-encoding regions by northern blotting. Our work not only provides a simple automated method for the genome-wide identification of candidate sRNA-encoding regions in RNA-Seq data, but also uncovers many novel candidate sRNA-encoding regions in M. tuberculosis, reinforcing the view that the control of gene expression in bacteria is more complex than previously anticipated.
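The abstract does not spell out the detection criteria, so the following Python is only a minimal sketch of the general idea of scanning a read coverage map for candidate regions, not the authors' method. It assumes a per-base coverage list for one strand and a list of annotated ORF intervals; the function names and the thresholds `min_depth`, `min_len` and `max_len` are hypothetical and chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based coordinates


def covered_runs(coverage: Sequence[int], min_depth: int) -> List[Interval]:
    """Return maximal runs of consecutive positions with depth >= min_depth."""
    runs: List[Interval] = []
    start = None
    for pos, depth in enumerate(coverage):
        if depth >= min_depth:
            if start is None:
                start = pos          # a new covered run begins here
        elif start is not None:
            runs.append((start, pos))  # the run ended at the previous base
            start = None
    if start is not None:            # run extends to the end of the sequence
        runs.append((start, len(coverage)))
    return runs


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def candidate_regions(
    coverage: Sequence[int],
    annotated: Sequence[Interval],
    min_depth: int = 10,   # assumed threshold, not from the paper
    min_len: int = 30,     # assumed lower length bound (nt)
    max_len: int = 500,    # assumed upper length bound (nt)
) -> List[Interval]:
    """Keep covered runs of plausible length that touch no annotated ORF."""
    return [
        run
        for run in covered_runs(coverage, min_depth)
        if min_len <= run[1] - run[0] <= max_len
        and not any(overlaps(run, orf) for orf in annotated)
    ]
```

Discarding every run that touches an annotated interval is a deliberately conservative choice, since it also removes coverage that spills over from untranslated regions of neighbouring genes. Under the same assumptions, antisense candidates could be screened by passing the coverage of the strand opposite each ORF and keeping the runs that do overlap an ORF interval, which requires strand-specific sequencing data.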