IsoSplitter: identification and characterization of alternative splicing sites without a reference genome.

Author information

Wang Yupeng, Hu Zhikang, Ye Ning, Yin Hengfu

Affiliations

College of Information Science and Technology, Nanjing Forestry University, Nanjing, China.

State Key Laboratory of Tree Genetics and Breeding, Research Institute of Subtropical Forestry, Chinese Academy of Forestry, Hangzhou, Zhejiang 311400, China.

Publication information

RNA. 2021 May 21;27(8):868-75. doi: 10.1261/rna.077834.120.

Abstract

Long-read transcriptome sequencing is designed to sequence full-length RNA molecules and is advantageous for identifying alternative splice isoforms; however, in the absence of a reference genome, it is difficult to locate splice sites accurately because of the diversity of alternative splicing (AS) patterns. Based on long-read transcriptome data, we developed a versatile tool, IsoSplitter, to reverse-trace and validate AS gene "split-sites" with the following features: (1) IsoSplitter initially invokes a modified SIM4 program to find transcript split-sites; (2) each split-site is then quantified to reveal transcript diversity, and putative isoforms are grouped into gene clusters; (3) an optional step for aligning short reads is provided to validate split-sites by identifying unique junction reads, and to reveal and quantify tissue-specific alternative splice isoforms. We tested IsoSplitter AS prediction using datasets from multiple model and non-model plant species, and showed that the IsoSplitter pipeline handles different transcriptomes efficiently and with high accuracy. Furthermore, we compared the IsoSplitter pipeline with two splice junction identification tools, the Program to Assemble Spliced Alignments (PASA, which requires a reference genome for AS identification) and AStrap, using data from the model plant Arabidopsis thaliana. We found that IsoSplitter identified more than twice as many AS events as AStrap, and that 94.13% of the AS events predicted by IsoSplitter were also identified by PASA. Starting from a simple sequence file, IsoSplitter is an assembly-free tool for the identification and characterization of AS. IsoSplitter is developed and implemented in Python 3.5 on the Linux platform and is freely available at https://github.com/Hengfu-Yin/IsoSplitter.
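The optional validation step in feature (3) can be illustrated with a minimal sketch. This is not IsoSplitter's own code: the `Alignment` type, the `count_junction_reads` function, the `min_overhang` threshold of 8 bases, and the toy coordinates are all hypothetical, chosen only to show one plausible way of counting short reads that span a split-site with sequence on both sides.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Alignment(NamedTuple):
    """A short read aligned to a long-read transcript (0-based, end-exclusive).

    Hypothetical record type for illustration only.
    """
    transcript: str
    start: int
    end: int


def count_junction_reads(
    split_sites: dict[str, list[int]],
    alignments: Iterable[Alignment],
    min_overhang: int = 8,  # assumed threshold, not taken from the paper
) -> Counter:
    """Count reads that cover a split-site with enough bases on both sides."""
    counts: Counter = Counter()
    for aln in alignments:
        for site in split_sites.get(aln.transcript, []):
            left = site - aln.start   # aligned bases upstream of the site
            right = aln.end - site    # aligned bases downstream of the site
            if left >= min_overhang and right >= min_overhang:
                counts[(aln.transcript, site)] += 1
    return counts


# Toy data: one transcript with two candidate split-sites.
split_sites = {"isoform_A": [120, 340]}
alignments = [
    Alignment("isoform_A", 100, 150),  # spans site 120 (20 left, 30 right)
    Alignment("isoform_A", 115, 165),  # only 5 bases left of site 120: rejected
    Alignment("isoform_A", 300, 350),  # spans site 340 (40 left, 10 right)
    Alignment("isoform_B", 100, 150),  # transcript has no split-sites
]
support = count_junction_reads(split_sites, alignments)
```

Requiring an overhang on both sides is a common guard against reads that merely touch a junction; a split-site with zero supporting reads in `support` would remain unvalidated under this sketch.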
