GMAP: a genomic mapping and alignment program for mRNA and EST sequences.

Authors

Wu Thomas D, Watanabe Colin K

Affiliation

Department of Bioinformatics, Genentech, Inc., South San Francisco, CA 94080, USA.

Publication information

Bioinformatics. 2005 May 1;21(9):1859-75. doi: 10.1093/bioinformatics/bti310. Epub 2005 Feb 22.

Abstract

MOTIVATION

We introduce GMAP, a standalone program for mapping and aligning cDNA sequences to a genome. The program maps and aligns a single sequence with minimal startup time and memory requirements, and provides fast batch processing of large sequence sets. The program generates accurate gene structures, even in the presence of substantial polymorphisms and sequence errors, without using probabilistic splice site models. Methodology underlying the program includes a minimal sampling strategy for genomic mapping, oligomer chaining for approximate alignment, sandwich DP for splice site detection, and microexon identification with statistical significance testing.
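The abstract names oligomer chaining as the approximate-alignment step but does not describe it. As a generic illustration only, and not GMAP's actual implementation, the sketch below indexes genomic k-mers, looks up each query k-mer, and keeps the longest chain of hits that advances in both the cDNA and the genome. The function names, the value of `k`, and the toy sequences are all assumptions made for this example.

```python
from collections import defaultdict


def index_kmers(genome, k):
    """Map each k-mer in the genome to the positions where it starts."""
    index = defaultdict(list)
    for g in range(len(genome) - k + 1):
        index[genome[g:g + k]].append(g)
    return index


def chain_hits(query, genome, k=8):
    """Return the longest co-linear chain of (query_pos, genome_pos) k-mer hits.

    Hypothetical helper: a plain O(n^2) dynamic program over exact k-mer
    matches, with no scoring of gap lengths or mismatches.
    """
    index = index_kmers(genome, k)
    hits = sorted(
        (q, g)
        for q in range(len(query) - k + 1)
        for g in index.get(query[q:q + k], ())
    )
    if not hits:
        return []

    best = [1] * len(hits)   # best[i]: length of the longest chain ending at hit i
    prev = [-1] * len(hits)  # prev[i]: predecessor of hit i in that chain
    for i, (qi, gi) in enumerate(hits):
        for j in range(i):
            qj, gj = hits[j]
            # A hit extends a chain only if it moves forward in both sequences.
            if qj < qi and gj < gi and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j

    # Trace back from the end of the longest chain.
    i = max(range(len(hits)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(hits[i])
        i = prev[i]
    return chain[::-1]


# Toy data (invented): two exons separated by an intron in the genome.
exon1 = "ACGTTGCAAGGCTTAC"
intron = "GTAAGT" + "C" * 10 + "T" * 10 + "AG"
exon2 = "TGGACCATGCAGTTCA"
genome = exon1 + intron + exon2
cdna = exon1 + exon2

chain = chain_hits(cdna, genome, k=8)
print(chain[0], chain[-1], len(chain))
```

In this toy case the chain jumps over the intron: consecutive hits on either side of the splice junction are close together in the cDNA but far apart in the genome. A real aligner would refine such a jump into exact splice sites in a later step.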

RESULTS

On a set of human messenger RNAs with random mutations introduced at rates of 1% and 3%, GMAP identified all splice sites accurately in over 99.3% of the sequences, an error rate one-tenth that of existing programs. On a large set of human expressed sequence tags, GMAP provided higher-quality alignments more often than blat did. On a set of Arabidopsis cDNAs, GMAP performed comparably with GeneSeqer. In these experiments, GMAP demonstrated a several-fold increase in speed over existing programs.

AVAILABILITY

Source code for gmap and associated programs is available at http://www.gene.com/share/gmap.

SUPPLEMENTARY INFORMATION

http://www.gene.com/share/gmap.
