Incorporation of splice site probability models for non-canonical introns improves gene structure prediction in plants.

Authors

Sparks Michael E, Brendel Volker

Affiliation

Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50011-3260, USA.

Publication

Bioinformatics. 2005 Nov 1;21 Suppl 3:iii20-30. doi: 10.1093/bioinformatics/bti1205.

DOI: 10.1093/bioinformatics/bti1205
PMID: 16306388

Abstract

MOTIVATION

The vast majority of introns in protein-coding genes of higher eukaryotes have a GT dinucleotide at their 5'-terminus and an AG dinucleotide at their 3' end. About 1-2% of introns are non-canonical, with the most abundant subtype of non-canonical introns being characterized by GC and AG dinucleotides at their 5'- and 3'-termini, respectively. Most current gene prediction software, whether based on ab initio or spliced alignment approaches, does not include explicit models for non-canonical introns or may exclude their prediction altogether. With present amounts of genome and transcript data, it is now possible to apply statistical methodology to non-canonical splice site prediction. We pursued one such approach and describe the training and implementation of GC-donor splice site models for Arabidopsis and rice, with the goal of exploring whether specific modeling of non-canonical introns can enhance gene structure prediction accuracy.
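
The intron classes named above are defined by their terminal dinucleotides, which can be illustrated with a minimal Python sketch. This is not the GeneSeqer or SplicePredictor implementation and it does no probabilistic scoring; the function names `classify_intron` and `tally_intron_classes` are hypothetical, and the input is assumed to be the intron sequence alone, written 5' to 3' with no flanking exon bases.

```python
from collections import Counter

# Terminal dinucleotide pairs for the intron classes named in the abstract.
# Keys are (first two bases, last two bases) of the intron, read 5' to 3'.
INTRON_CLASSES = {
    ("GT", "AG"): "GT-AG",  # canonical
    ("GC", "AG"): "GC-AG",  # most abundant non-canonical subtype
    ("AT", "AC"): "AT-AC",  # non-canonical
}


def classify_intron(intron: str) -> str:
    """Label an intron by its 5' and 3' terminal dinucleotides.

    Hypothetical helper for illustration only: it inspects the termini
    and does not score splice sites the way a trained model would.
    """
    seq = intron.upper().replace("U", "T")  # accept RNA or lowercase input
    if len(seq) < 4:
        raise ValueError("intron must contain at least 4 bases")
    return INTRON_CLASSES.get((seq[:2], seq[-2:]), "other")


def tally_intron_classes(introns):
    """Count how many introns fall into each terminal-dinucleotide class."""
    return Counter(classify_intron(s) for s in introns)


if __name__ == "__main__":
    examples = ["GTAAGTTTTCAG", "gcaagtttttag", "ATATCCTTTTAC", "GTAAGTTTTCAC"]
    for name, count in sorted(tally_intron_classes(examples).items()):
        print(name, count)
```

A tally like this only reports how often each class occurs in a set of annotated introns; deciding whether a candidate GC site is a genuine donor is the job of the trained splice site models the abstract describes.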

RESULTS

Our results indicate that the incorporation of non-canonical splice site models yields dramatic improvements in annotating genes containing GC-AG and AT-AC non-canonical introns. Comparison of models shows differences between monocot and dicot species, but also suggests GC intron-specific biases independent of taxonomic clade. We also present evidence that GC-AG introns occur preferentially in genes with atypically high exon counts.

AVAILABILITY

Source code for the updated versions of GeneSeqer and SplicePredictor (distributed with the GeneSeqer code) is available at http://bioinformatics.iastate.edu/bioinformatics2go/gs/download.html. Web servers for Arabidopsis, rice and other plant species are accessible at http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/AtGDBgs.cgi, http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/OsGDBgs.cgi and http://www.plantgdb.org/PlantGDB-cgi/GeneSeqer/PlantGDBgs.cgi, respectively. A SplicePredictor web server is available at http://bioinformatics.iastate.edu/cgi-bin/sp.cgi. Software to generate training data and parameterizations for Bayesian splice site models is available at http://gremlin1.gdcb.iastate.edu/~volker/SB05B/BSSM4GSQ/.

Similar articles

1
Incorporation of splice site probability models for non-canonical introns improves gene structure prediction in plants.
Bioinformatics. 2005 Nov 1;21 Suppl 3:iii20-30. doi: 10.1093/bioinformatics/bti1205.
2
Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus.
Bioinformatics. 2004 May 1;20(7):1157-69. doi: 10.1093/bioinformatics/bth058. Epub 2004 Feb 5.
3
Common introns within orthologous genes: software and application to plants.
Brief Bioinform. 2009 Nov;10(6):631-44. doi: 10.1093/bib/bbp051.
4
GeneSeqer@PlantGDB: Gene structure prediction in plant genomes.
Nucleic Acids Res. 2003 Jul 1;31(13):3597-600. doi: 10.1093/nar/gkg533.
5
Gene structure prediction by spliced alignment of genomic DNA with protein sequences: increased accuracy by differential splice site scoring.
J Mol Biol. 2000 Apr 14;297(5):1075-85. doi: 10.1006/jmbi.2000.3641.
6
PIP: a database of potential intron polymorphism markers.
Bioinformatics. 2007 Aug 15;23(16):2174-7. doi: 10.1093/bioinformatics/btm296. Epub 2007 Jun 1.
7
JIGSAW: integration of multiple sources of evidence for gene prediction.
Bioinformatics. 2005 Sep 15;21(18):3596-603. doi: 10.1093/bioinformatics/bti609. Epub 2005 Aug 2.
8
Optimal spliced alignment of homologous cDNA to a genomic DNA template.
Bioinformatics. 2000 Mar;16(3):203-11. doi: 10.1093/bioinformatics/16.3.203.
9
Information for the Coordinates of Exons (ICE): a human splice sites database.
Genomics. 2004 Oct;84(4):762-6. doi: 10.1016/j.ygeno.2004.05.007.
10
One parameter to describe the mechanism of splice sites competition.
Biochem Biophys Res Commun. 2008 Apr 4;368(2):379-81. doi: 10.1016/j.bbrc.2008.01.089. Epub 2008 Jan 28.

Cited by

1
Animal, Fungi, and Plant Genome Sequences Harbor Different Non-Canonical Splice Sites.
Cells. 2020 Feb 18;9(2):458. doi: 10.3390/cells9020458.
2
SplicedFamAlign: CDS-to-gene spliced alignment and identification of transcript orthology groups.
BMC Bioinformatics. 2019 Mar 29;20(Suppl 3):133. doi: 10.1186/s12859-019-2647-2.
3
Genome-wide analyses supported by RNA-Seq reveal non-canonical splice sites in plant genomes.
BMC Genomics. 2018 Dec 29;19(1):980. doi: 10.1186/s12864-018-5360-z.
4
CMT3 and SUVH4/KYP silence the exonic Evelknievel retroelement to allow for reconstitution of CMT1 mRNA.
Epigenetics Chromatin. 2018 Nov 16;11(1):69. doi: 10.1186/s13072-018-0240-y.
5
Characterization of viral RNA splicing using whole-transcriptome datasets from host species.
Sci Rep. 2018 Feb 19;8(1):3273. doi: 10.1038/s41598-018-21190-7.
6
Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence.
BMC Res Notes. 2017 Dec 4;10(1):667. doi: 10.1186/s13104-017-2985-y.
7
Globally distributed root endophyte Phialocephala subalpina links pathogenic and saprophytic lifestyles.
BMC Genomics. 2016 Dec 9;17(1):1015. doi: 10.1186/s12864-016-3369-8.
8
Genome-wide analysis of alternative splicing in Volvox carteri.
BMC Genomics. 2014 Dec 16;15:1117. doi: 10.1186/1471-2164-15-1117.
9
RNA-seq analysis of the C. briggsae transcriptome.
Genome Res. 2012 Aug;22(8):1567-80. doi: 10.1101/gr.134601.111. Epub 2012 Jul 6.
10
MaizeGDB becomes 'sequence-centric'.
Database (Oxford). 2009;2009:bap020. doi: 10.1093/database/bap020. Epub 2009 Dec 7.