Sim4cc: a cross-species spliced alignment program.

Authors

Zhou Leming, Pertea Mihaela, Delcher Arthur L, Florea Liliana

Affiliation

Department of Computer Science, George Washington University, Washington, DC 20052, USA.

Publication information

Nucleic Acids Res. 2009 Jun;37(11):e80. doi: 10.1093/nar/gkp319. Epub 2009 May 8.

DOI: 10.1093/nar/gkp319
PMID: 19429899
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2699533/
Abstract

Advances in sequencing technologies have accelerated the sequencing of new genomes, far outpacing the generation of gene and protein resources needed to annotate them. Direct comparison and alignment of existing cDNA sequences from a related species is an effective and readily available means to determine genes in the new genomes. Current spliced alignment programs are inadequate for comparing sequences between different species, owing to their low sensitivity and splice junction accuracy. A new spliced alignment tool, sim4cc, overcomes problems in the earlier tools by incorporating three new features: universal spaced seeds, to increase sensitivity and allow comparisons between species at various evolutionary distances, and powerful splice signal models and evolutionarily-aware alignment techniques, to improve the accuracy of gene models. When tested on vertebrate comparisons at diverse evolutionary distances, sim4cc had significantly higher sensitivity compared to existing alignment programs, more than 10% higher than the closest competitor for some comparisons, while being comparable in speed to its predecessor, sim4. Sim4cc can be used in one-to-one or one-to-many comparisons of genomic and cDNA sequences, and can also be effectively incorporated into a high-throughput annotation engine, as demonstrated by the mapping of 64,000 Fagus grandifolia 454 ESTs and unigenes to the poplar genome.
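
The spaced-seed idea mentioned in the abstract can be illustrated with a minimal sketch. This is not sim4cc's implementation: the abstract does not give the seed patterns the program uses, so the mask `1101`, the toy sequences, and the function names below are all hypothetical. A spaced seed compares two windows only at the positions marked `1` and ignores the positions marked `0`, so a hit can survive a substitution that would break a contiguous seed of the same span.

```python
from collections import defaultdict


def spaced_key(seq, start, mask):
    """Collect the characters of seq that fall under the '1' positions of mask."""
    return "".join(seq[start + i] for i, m in enumerate(mask) if m == "1")


def find_seed_hits(cdna, genome, mask):
    """Return sorted (cdna_pos, genome_pos) pairs whose windows agree
    at every '1' position of mask ('0' positions are not compared)."""
    span = len(mask)
    # Index every genome window by its spaced key.
    index = defaultdict(list)
    for g in range(len(genome) - span + 1):
        index[spaced_key(genome, g, mask)].append(g)
    # Look up every cDNA window in the index.
    hits = []
    for c in range(len(cdna) - span + 1):
        for g in index.get(spaced_key(cdna, c, mask), []):
            hits.append((c, g))
    return sorted(hits)


# Hypothetical toy data: the cDNA matches genome[2:9] except for one
# substitution at cDNA position 3 (genome has C, cDNA has A).
genome = "CCATGCAGTCC"
cdna = "ATGAAGT"

contiguous_hits = find_seed_hits(cdna, genome, "1111")  # every position must match
spaced_hits = find_seed_hits(cdna, genome, "1101")      # third position is ignored

print(contiguous_hits)  # []
print(spaced_hits)      # [(1, 3)]
```

In this toy case the contiguous seed `1111` finds no hit because every 4-base window of the cDNA overlaps the substitution, while the spaced seed `1101` finds one hit where the substitution falls on the ignored position. A real aligner would go on to extend and chain such hits into exons, which this sketch does not attempt.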

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b8c/2699533/fdccc9d06dcf/gkp319f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b8c/2699533/31d90075cb15/gkp319f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b8c/2699533/7c509d6bfc44/gkp319f2.jpg
