GMAP: a genomic mapping and alignment program for mRNA and EST sequences.

Authors

Wu Thomas D, Watanabe Colin K

Affiliation

Department of Bioinformatics, Genentech, Inc., South San Francisco, CA 94080, USA.

Publication

Bioinformatics. 2005 May 1;21(9):1859-75. doi: 10.1093/bioinformatics/bti310. Epub 2005 Feb 22.

Abstract

MOTIVATION

We introduce GMAP, a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.
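The oligomer-chaining idea can be illustrated with a toy sketch. This is not GMAP's implementation: the function names, the k-mer size, the quadratic chaining rule, and the toy sequences below are all assumptions made for illustration. The sketch indexes genomic k-mers, collects exact k-mer matches against a cDNA, and keeps the longest chain of matches that increases in both coordinates, so that a large jump in the genomic coordinate hints at an intron.

```python
from collections import defaultdict


def index_kmers(genome, k):
    """Map each k-mer in the genome to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(query, index, k):
    """Return sorted (query_pos, genome_pos) pairs for every shared k-mer."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for gpos in index.get(query[qpos:qpos + k], ()):
            hits.append((qpos, gpos))
    return sorted(hits)


def chain_hits(hits):
    """Longest chain of hits strictly increasing in both coordinates.

    Simple O(n^2) dynamic programming; an assumption for illustration,
    not the scoring scheme used by GMAP.
    """
    if not hits:
        return []
    best = [1] * len(hits)   # best chain length ending at each hit
    prev = [-1] * len(hits)  # back-pointer to the previous hit in that chain
    for i, (qi, gi) in enumerate(hits):
        for j in range(i):
            qj, gj = hits[j]
            if qj < qi and gj < gi and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    i = max(range(len(hits)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(hits[i])
        i = prev[i]
    return chain[::-1]


# Toy data (hypothetical): two exons separated by an intron, flanked by padding.
exon1, intron, exon2 = "ACGTTGCA", "GTAAGTTTTTCAG", "GGATCCTA"
genome = "TTTT" + exon1 + intron + exon2 + "AAAA"
cdna = exon1 + exon2

K = 4  # assumed oligomer length for this toy example
chain = chain_hits(find_hits(cdna, index_kmers(genome, K), K))

# The largest jump in genomic position between consecutive hits marks
# the approximate location of the intron.
largest_gap = max(g2 - g1 for (_, g1), (_, g2) in zip(chain, chain[1:]))
```

A chain like this gives only an approximate alignment; locating exact splice sites within the gap is the job the abstract assigns to the sandwich DP step, which this sketch does not attempt.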

RESULTS

On a set of human messenger RNAs with random mutations at rates of 1% and 3%, GMAP identified all splice sites accurately in over 99.3% of the sequences, an error rate one-tenth that of existing programs. On a large set of human expressed sequence tags, GMAP provided higher-quality alignments more often than blat did. On a set of Arabidopsis cDNAs, GMAP performed comparably with GeneSeqer. In these experiments, GMAP demonstrated a several-fold increase in speed over existing programs.

AVAILABILITY

Source code for gmap and associated programs is available at http://www.gene.com/share/gmap

SUPPLEMENTARY INFORMATION

http://www.gene.com/share/gmap.
