Ebhardt H Alexander, Tsang Herbert H, Dai Denny C, Liu Yifeng, Bostan Babak, Fahlman Richard P
Department of Biochemistry, University of Alberta, Edmonton, AB, T6G 2H7, Canada.
Nucleic Acids Res. 2009 May;37(8):2461-70. doi: 10.1093/nar/gkp093. Epub 2009 Mar 2.
Recent advances in DNA-sequencing technology have made it possible to obtain large datasets of small RNA sequences. Here we demonstrate that not all non-perfectly matched small RNA sequences are simple technological sequencing errors; many hold valuable biological information. Analysis of three small RNA datasets originating from Oryza sativa and Arabidopsis thaliana small RNA-sequencing projects demonstrates that many single nucleotide substitution errors overlap when aligning homologous non-identical small RNA sequences. Investigating the sites and identities of substitution errors reveals that many potentially originate from post-transcriptional modifications or RNA editing. Modifications include N1-methyl modified purine nucleotides in tRNA, potential deamination or base substitutions in microRNAs, 3' microRNA uridine extensions and 5' microRNA deletions. Additionally, further analysis of large sequencing datasets reveals that the combined effects of 5' deletions and 3' uridine extensions can alter the specificity with which microRNAs associate with different Argonaute proteins. Hence, we demonstrate that not all sequencing errors in small RNA datasets are technical artifacts, and that they often provide valuable biological insights into the sites of post-transcriptional RNA modifications.
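The variant classes named in the abstract (single nucleotide substitutions, 5' deletions and 3' uridine extensions) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the `max_deletion` cutoff and the toy reference sequence are all assumptions made for illustration, and the sketch compares a read against one known reference by plain string matching rather than by genome alignment.

```python
from typing import Optional, Tuple

# Hypothetical, made-up 21-nt reference sequence (not a real microRNA).
REFERENCE = "AUGGCUAGCUAGGCAUCGAUC"


def single_substitution(read: str, reference: str) -> Optional[Tuple[int, str, str]]:
    """Return (1-based position, reference base, read base) if the read
    differs from the reference at exactly one position, else None."""
    if len(read) != len(reference):
        return None
    diffs = [i for i, (r, q) in enumerate(zip(reference, read)) if r != q]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    return i + 1, reference[i], read[i]


def end_variant(read: str, reference: str, max_deletion: int = 3) -> Optional[Tuple[int, int]]:
    """Return (bases missing at the 5' end, uridines added at the 3' end)
    if the read equals the 5'-trimmed reference followed by zero or more U,
    else None. max_deletion is an arbitrary illustrative cutoff."""
    for deleted in range(max_deletion + 1):
        trimmed = reference[deleted:]
        if read.startswith(trimmed):
            tail = read[len(trimmed):]
            if all(base == "U" for base in tail):
                return deleted, len(tail)
    return None


if __name__ == "__main__":
    # One base removed at the 5' end and one U added at the 3' end.
    print(end_variant(REFERENCE[1:] + "U", REFERENCE))
    # C-to-U change at position 5, the pattern expected for a deamination.
    print(single_substitution("AUGGUUAGCUAGGCAUCGAUC", REFERENCE))
```

When the reference itself ends in U, a read lacking that base and a read carrying an added U cannot be told apart by string comparison alone, so a real analysis would need additional evidence to separate the two cases.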