Identification of genome-wide non-canonical spliced regions and analysis of biological functions for spliced sequences using Read-Split-Fly.

Author information

Bai Yongsheng, Kinne Jeff, Ding Lizhong, Rath Ethan C, Cox Aaron, Naidu Siva Dharman

Affiliations

Department of Biology, Indiana State University, 600 Chestnut Street, Terre Haute, IN, 47809, USA.

The Center for Genomic Advocacy, Indiana State University, 600 Chestnut Street, Terre Haute, IN, 47809, USA.

Publication information

BMC Bioinformatics. 2017 Oct 3;18(Suppl 11):382. doi: 10.1186/s12859-017-1801-y.

Abstract

BACKGROUND

It is generally thought that most canonical or non-canonical splicing events involving the U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm remains unclear. In recent years, next-generation sequencing technologies have revolutionized the field. The "Read-Split-Walk" (RSW) and "Read-Split-Run" (RSR) methods were developed to identify genome-wide non-canonical spliced regions, including special events occurring in the cytoplasm. As large amounts of genome and transcriptome data have been generated, for example by the Encyclopedia of DNA Elements (ENCODE) project, we have developed a newer, more memory-efficient version of the algorithm, "Read-Split-Fly" (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for downstream biological function analysis.

RESULTS

We used open-access ENCODE project RNA-Seq data to examine whether some spliced intron sequences carry potential signatures of U12-type splicing. The check was performed by searching the spliced sequences against 5'ss and 3'ss sequences from U12DB, the well-known database of orthologous U12-type spliceosomal introns. Preliminary results from 70 ENCODE samples indicated that 5'ss with a U12-type signature are more frequent than those with a U2-type signature and are prevalent in the non-canonical junctions reported by RSF. The selected spliced sequences were also studied further using miRBase to elucidate their functionality. Preliminary results from 70 ENCODE samples show that several miRNAs are prevalent in the studied samples. Two of these, hsa-miR-1273 and hsa-miR-548, are associated with many diseases and cancers, as suggested in the literature.
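As a rough illustration of the kind of splice-site check described above, the following minimal sketch labels spliced sequences by whether their first bases match a reference 5'ss motif. It is not the RSF implementation: the function names `classify_5ss` and `tally` are hypothetical, and the small motif sets are toy stand-ins (based on commonly cited consensus sequences) for the 5'ss sequences that a real analysis would load from U12DB.

```python
from collections import Counter

# Toy reference motifs (assumption): a real analysis would load the
# 5'ss sequences exported from U12DB instead of these few entries.
U12_5SS = {"GTATCCTT", "ATATCCTT"}
U2_5SS = {"GTAAGT", "GTGAGT"}


def classify_5ss(intron_seq: str) -> str:
    """Label a spliced sequence by the reference 5'ss motif its start matches."""
    # Normalize case and convert RNA (U) to DNA (T) notation.
    seq = intron_seq.upper().replace("U", "T")
    if any(seq.startswith(motif) for motif in U12_5SS):
        return "U12-like"
    if any(seq.startswith(motif) for motif in U2_5SS):
        return "U2-like"
    return "other"


def tally(introns):
    """Count how many sequences fall into each 5'ss class."""
    return Counter(classify_5ss(seq) for seq in introns)


if __name__ == "__main__":
    # Hypothetical example sequences, not taken from ENCODE data.
    example = ["GTATCCTTAAAC", "guaaguACGT", "ATATCCTTGG", "CCCC"]
    print(tally(example))
```

Exact prefix matching is the simplest possible comparison. A search against a full database would more plausibly use an alignment or position-weight-matrix score, and would apply the same idea to the 3'ss end of each sequence.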

CONCLUSIONS

Our RSF pipeline is able to detect many possible junctions (especially those with a high RPKM) with very high overall accuracy and relatively high accuracy for novel junctions. We have incorporated useful features into the pipeline, such as handling variable-length read data and searching spliced sequences for splicing signatures and miRNA events. We suggest that RSF, a tool for identifying novel splicing events, is applicable to studying a range of diseases across biological systems under different experimental conditions.
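For readers unfamiliar with the RPKM measure mentioned above, here is a minimal sketch of the standard definition (reads per kilobase of region per million mapped reads). The function name `rpkm` and its parameter names are illustrative assumptions, and the abstract does not say how RSF itself computes this value.

```python
def rpkm(read_count: int, region_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of region per million mapped reads.

    RPKM = read_count * 1e9 / (region_length_bp * total_mapped_reads)
    """
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and total mapped reads must be positive")
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical numbers: 500 reads on a 2,000 bp region
    # in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

The factor of 1e9 combines the per-kilobase (1e3) and per-million-reads (1e6) scalings, so values are comparable across regions of different lengths and libraries of different depths.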


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a76/5629565/07c3871df605/12859_2017_1801_Fig1_HTML.jpg
