GAME: a simple and efficient whole genome alignment method using maximal exact match filtering.

Author information

Choi Jeong-Hyeon, Cho Hwan-Gue, Kim Sun

Affiliation

School of Informatics, Indiana University, Bloomington, IN 47408, USA.

Publication information

Comput Biol Chem. 2005 Jun;29(3):244-53. doi: 10.1016/j.compbiolchem.2005.04.004.

DOI: 10.1016/j.compbiolchem.2005.04.004
PMID: 15979044
Abstract

In this paper, we present a simple and efficient whole genome alignment method using maximal exact match (MEM). The major problem with the use of MEM anchor is that the number of hits in non-homologous regions increases exponentially when shorter MEM anchors are used to detect more homologous regions. To deal with this problem, we have developed a fast and accurate anchor filtering scheme based on simple match extension with minimum percent identity and extension length criteria. Due to its simplicity and accuracy, all MEM anchors in a pair of genomes can be exhaustively tested and filtered. In addition, by incorporating the translation technique, the alignment quality and speed of our genome alignment algorithm have been further improved. As a result, our genome alignment algorithm, GAME (Genome Alignment by Match Extension), performs competitively over existing algorithms and can align large whole genomes, e.g., A. thaliana, without the requirement of typical large memory and parallel processors. This is shown using an experiment which compares the performance of BLAST, BLASTZ, PatternHunter, MUMmer and our algorithm in aligning all 45 pairs of 10 microbial genomes. The scalability of our algorithm is shown in another experiment where all pairs of five chromosomes in A. thaliana were compared.
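
The abstract names the filtering criteria (match extension, minimum percent identity, extension length) but does not give their details. As a rough illustration only, the Python sketch below assumes an ungapped extension of a fixed number of bases on each side of an exact-match anchor, and keeps the anchor when the flanking bases reach a minimum identity. The function names `extend` and `keep_anchor`, the parameters `ext_len` and `min_identity`, and their default values are hypothetical and are not taken from the paper.

```python
def extend(a, b, i, j, step, max_len):
    """Ungapped extension from a[i], b[j] in direction `step` (+1 or -1).

    Returns (matches, length) over at most `max_len` compared positions,
    stopping early at either sequence boundary.
    """
    matches = 0
    length = 0
    while length < max_len and 0 <= i < len(a) and 0 <= j < len(b):
        if a[i] == b[j]:
            matches += 1
        length += 1
        i += step
        j += step
    return matches, length


def keep_anchor(a, b, start_a, start_b, anchor_len, ext_len=20, min_identity=0.7):
    """Decide whether an exact-match anchor looks homologous (assumed criteria).

    The anchor is a[start_a:start_a+anchor_len] == b[start_b:start_b+anchor_len].
    Both flanks are extended by up to `ext_len` bases without gaps. The anchor is
    kept only if at least `ext_len` flanking bases were compared in total and
    their identity is at least `min_identity`.
    """
    left_m, left_n = extend(a, b, start_a - 1, start_b - 1, -1, ext_len)
    right_m, right_n = extend(a, b, start_a + anchor_len, start_b + anchor_len, 1, ext_len)
    compared = left_n + right_n
    if compared < ext_len:
        # Too little flanking sequence to judge the anchor.
        return False
    return (left_m + right_m) / compared >= min_identity


# Usage: the same 4-base anchor "ACGT" in a homologous and a random context.
homolog_a = "GGCATTAGACGTTTAGCCAT"
homolog_b = "GGCATTAGACGTTTAGCCAT"
random_a = "TTTTTTTTACGTGGGGGGGG"
random_b = "CCCCCCCCACGTAAAAAAAA"
print(keep_anchor(homolog_a, homolog_b, 8, 8, 4, ext_len=4))
print(keep_anchor(random_a, random_b, 8, 8, 4, ext_len=4))
```

Because each test touches only a bounded number of bases around the anchor, a check of this kind is cheap enough to run on every anchor, which is consistent with the abstract's claim that all MEM anchors can be tested exhaustively.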

Similar articles

1
GAME: a simple and efficient whole genome alignment method using maximal exact match filtering.
Comput Biol Chem. 2005 Jun;29(3):244-53. doi: 10.1016/j.compbiolchem.2005.04.004.
2
SGA: a grammar-based alignment algorithm.
Comput Methods Programs Biomed. 2007 Apr;86(1):17-20. doi: 10.1016/j.cmpb.2006.12.007. Epub 2007 Jan 30.
3
A space-efficient algorithm for the constrained pairwise sequence alignment problem.
Genome Inform. 2005;16(2):237-46.
4
How to usefully compare homologous plant genes and chromosomes as DNA sequences.
Plant J. 2008 Feb;53(4):661-73. doi: 10.1111/j.1365-313X.2007.03326.x.
5
Compressed indexing and local alignment of DNA.
Bioinformatics. 2008 Mar 15;24(6):791-7. doi: 10.1093/bioinformatics/btn032. Epub 2008 Jan 28.
6
Murlet: a practical multiple alignment tool for structural RNA sequences.
Bioinformatics. 2007 Jul 1;23(13):1588-98. doi: 10.1093/bioinformatics/btm146. Epub 2007 Apr 25.
7
A novel feature-based method for whole genome phylogenetic analysis without alignment: application to HEV genotyping and subtyping.
Biochem Biophys Res Commun. 2008 Apr 4;368(2):223-30. doi: 10.1016/j.bbrc.2008.01.070. Epub 2008 Jan 28.
8
Ancestral sequence alignment under optimal conditions.
BMC Bioinformatics. 2005 Nov 17;6:273. doi: 10.1186/1471-2105-6-273.
9
Multiple mapping method: a novel approach to the sequence-to-structure alignment problem in comparative protein structure modeling.
Proteins. 2006 May 15;63(3):644-61. doi: 10.1002/prot.20835.
10
Accurate identification of orthologous segments among multiple genomes.
Bioinformatics. 2009 Apr 1;25(7):853-60. doi: 10.1093/bioinformatics/btp070. Epub 2009 Feb 2.

Cited by

1
DART: a fast and accurate RNA-seq mapper with a partitioning strategy.
Bioinformatics. 2018 Jan 15;34(2):190-197. doi: 10.1093/bioinformatics/btx558.
2
NucDiff: in-depth characterization and annotation of differences between two sets of DNA sequences.
BMC Bioinformatics. 2017 Jul 12;18(1):338. doi: 10.1186/s12859-017-1748-z.
3
Long read alignment based on maximal exact match seeds.
Bioinformatics. 2012 Sep 15;28(18):i318-i324. doi: 10.1093/bioinformatics/bts414.
4
Genome-wide DNA methylation maps in follicular lymphoma cells determined by methylation-enriched bisulfite sequencing.
PLoS One. 2010 Sep 29;5(9):e13020. doi: 10.1371/journal.pone.0013020.
5
A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays.
Bioinformatics. 2009 Jul 1;25(13):1609-16. doi: 10.1093/bioinformatics/btp275. Epub 2009 Apr 23.
6
De novo identification of LTR retrotransposons in eukaryotic genomes.
BMC Genomics. 2007 Apr 3;8:90. doi: 10.1186/1471-2164-8-90.
7
CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes.
BMC Bioinformatics. 2006 Oct 24;7:472. doi: 10.1186/1471-2105-7-472.