Expanding Alternative Splicing Identification by Integrating Multiple Sources of Transcription Data in Tomato.

Author information

Clark Sarah, Yu Feng, Gu Lianfeng, Min Xiang Jia

Affiliations

Department of Biological Sciences, Youngstown State University, Youngstown, OH, United States.

Department of Computer Science and Information Systems, Youngstown State University, Youngstown, OH, United States.

Publication information

Front Plant Sci. 2019 May 28;10:689. doi: 10.3389/fpls.2019.00689. eCollection 2019.

Abstract

Tomato (Solanum lycopersicum) is an important vegetable and fruit crop. Its genome has been completely sequenced, and a large number of expressed sequence tags (ESTs) and short reads generated by RNA sequencing (RNA-seq) technologies are also available. Mapping transcripts, including mRNA sequences, ESTs, and RNA-seq reads, to the genome makes it possible to identify pre-mRNA alternative splicing (AS), a post-transcriptional process that generates two or more RNA isoforms from one pre-mRNA transcript. We comprehensively analyzed the AS landscape in tomato by integrating genome mapping information for all available mRNAs and ESTs with mapping information for RNA-seq reads collected from 27 published projects. A total of 369,911 AS events were identified from 34,419 genomic loci involving 161,913 transcripts. Among the basic AS events, intron retention was the most prevalent type (18.9%), followed by alternative acceptor site (12.9%) and alternative donor site (7.3%), with exon skipping the least common type (6.0%). Complex AS types comprising two or more basic events accounted for 54.9% of total AS events. Of 35,768 annotated protein-coding gene models, 23,233 were found to have pre-mRNAs that generate AS isoform transcripts. Thus the estimated AS rate in tomato was 65.0%. The list of identified AS genes with their corresponding transcript isoforms serves as a catalog for further detailed examination of gene functions in tomato biology. The post-transcriptional information is also expected to be useful in improving the predicted gene models in tomato. The sequence and annotation information can be accessed at the Plant Alternative Splicing Database (http://proteomics.ysu.edu/altsplice).
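The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' pipeline: it recomputes the AS rate from the two gene-model counts and confirms that the reported event-type shares add up to 100%. The variable names are chosen here for readability and do not come from the paper.

```python
# Counts quoted in the abstract.
annotated_gene_models = 35_768   # annotated protein-coding gene models
gene_models_with_as = 23_233     # gene models with AS isoform transcripts

# AS rate = gene models showing AS / all annotated gene models.
as_rate = 100 * gene_models_with_as / annotated_gene_models

# Reported share of each AS event type, in percent of all AS events.
# (Dictionary keys are descriptive labels chosen for this sketch.)
event_shares = {
    "intron retention": 18.9,
    "alternative acceptor site": 12.9,
    "alternative donor site": 7.3,
    "exon skipping": 6.0,
    "complex (two or more basic events)": 54.9,
}
total_share = sum(event_shares.values())

print(f"Estimated AS rate: {as_rate:.1f}%")
print(f"Sum of event-type shares: {total_share:.1f}%")
```

The four basic types together account for 45.1% of events, so the complex category (54.9%) is larger than all basic types combined.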

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a040/6546887/cd8c4d7f55bb/fpls-10-00689-g001.jpg
