Combining many multiple alignments in one improved alignment.

Authors

Bucka-Lassen K, Caprani O, Hein J

Affiliations

Object Oriented Ltd, 6004 Luzern, Switzerland, Department of Computer Science and Department of Ecology and Genetics, University of Aarhus, 8000 Aarhus C, Denmark.

Publication

Bioinformatics. 1999 Feb;15(2):122-30. doi: 10.1093/bioinformatics/15.2.122.

DOI:10.1093/bioinformatics/15.2.122
PMID:10089197
Abstract

MOTIVATION

The fact that the multiple sequence alignment problem is of high complexity has led to many different heuristic algorithms attempting to find a solution in what would be considered a reasonable amount of computation time and space. Very few of these heuristics produce results that are guaranteed always to lie within a certain distance of an optimal solution (given a measure of quality, e.g. parsimony). Most practical heuristics cannot guarantee this, but nevertheless perform well for certain cases. An alignment obtained with one of these heuristics and with a bad overall score is not unusable, though: it might contain important information on how substrings should be aligned. This paper presents a method that extracts qualitatively good sub-alignments from a set of multiple alignments and combines these into a new, often improved alignment. The algorithm is implemented as a variant of the traditional dynamic programming technique.
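
The idea of combining sub-alignments by dynamic programming can be illustrated with a small Python sketch. This is not the authors' ComAlign implementation; it is a simplified illustration under stated assumptions. Each input alignment is treated as a path through the lattice of per-sequence prefix lengths, the paths are merged into one graph, and the cheapest route from start to end is found by dynamic programming, so the result may switch between alignments wherever their paths meet. The function names (`column_cost`, `combine_alignments`) and the sum-of-pairs cost with mismatch 1 and gap 2 are hypothetical choices made for this example only.

```python
from itertools import combinations


def column_cost(col, mismatch=1, gap=2):
    """Sum-of-pairs cost of one alignment column (lower is better).

    The cost values are illustrative assumptions, not taken from the paper.
    """
    cost = 0
    for a, b in combinations(col, 2):
        if a == "-" and b == "-":
            continue  # gap against gap costs nothing
        if a == "-" or b == "-":
            cost += gap
        elif a != b:
            cost += mismatch
    return cost


def combine_alignments(alignments):
    """Combine several alignments of the same sequences (same row order).

    Each alignment is a list of equal-length gapped strings. A node is the
    tuple of residues consumed so far in each sequence; every column of every
    input alignment is an edge between two nodes.
    """
    k = len(alignments[0])
    start = (0,) * k
    edges = {}  # node -> list of (next_node, column, cost)
    end = start
    for rows in alignments:
        node = start
        for col in zip(*rows):
            nxt = tuple(c + (ch != "-") for c, ch in zip(node, col))
            if nxt == node:
                continue  # skip all-gap columns
            edges.setdefault(node, []).append((nxt, col, column_cost(col)))
            node = nxt
        end = node

    # Every edge increases the total residue count, so sorting nodes by that
    # count gives a valid order for the dynamic programming pass.
    best = {start: (0, None)}  # node -> (cost so far, (previous node, column))
    for node in sorted(edges, key=sum):
        cost_here = best[node][0]
        for nxt, col, c in edges[node]:
            if nxt not in best or cost_here + c < best[nxt][0]:
                best[nxt] = (cost_here + c, (node, col))

    # Trace back from the end node to recover the chosen columns.
    cols = []
    node = end
    while node != start:
        node, col = best[node][1]
        cols.append(col)
    cols.reverse()
    return ["".join(r) for r in zip(*cols)], best[end][0]


# Two alignments of the same pair of sequences, each good in one half only.
aln_a = ["ACAGTG", "A-AGG-"]  # cost 5: good left half, poor right half
aln_b = ["ACAGTG", "AA-G-G"]  # cost 5: poor left half, good right half
combined, cost = combine_alignments([aln_a, aln_b])
print(combined, cost)  # ['ACAGTG', 'A-AG-G'] 4
```

In this toy input both alignments cost 5 under the assumed scoring, while the combined result costs 4 because it takes the left half from the first alignment and the right half from the second.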

RESULTS

An implementation of ComAlign (the algorithm that combines multiple alignments) has been run on several sets of artificially generated sequences and a set of 5S RNA sequences. To assess the quality of the alignments obtained, the results have been compared with the output of MSA 2.1 (Gupta et al., Proceedings of the Sixth Annual Symposium on Combinatorial Pattern Matching, 1995; Kececioglu et al., http://www.techfak.uni-bielefeld.de/bcd/Lectures/kececioglu.html, 1995). In all cases, ComAlign was able to produce a solution with a score comparable to the solution obtained by MSA. The results also show that ComAlign actually does combine parts from different alignments and not just select the best of them.

AVAILABILITY

The C source code (a Smalltalk version is being worked on) of ComAlign and the other programs that have been implemented in this context are free and available on WWW (http://www.daimi.au.dk/~ocaprani).

CONTACT

klaus@bucka-lassen.dk; jotun@pop.bio.au.dk; ocaprani@daimi.au.dk

Similar articles

1
Combining many multiple alignments in one improved alignment.
Bioinformatics. 1999 Feb;15(2):122-30. doi: 10.1093/bioinformatics/15.2.122.
2
A polynomial time solvable formulation of multiple sequence alignment.
J Comput Biol. 2006 Mar;13(2):309-19. doi: 10.1089/cmb.2006.13.309.
3
Consensus shapes: an alternative to the Sankoff algorithm for RNA consensus structure prediction.
Bioinformatics. 2005 Sep 1;21(17):3516-23. doi: 10.1093/bioinformatics/bti577. Epub 2005 Jul 14.
4
DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment.
Bioinformatics. 1999 Mar;15(3):211-8. doi: 10.1093/bioinformatics/15.3.211.
5
A survey of multiple sequence comparison methods.
Bull Math Biol. 1992 Jul;54(4):563-98. doi: 10.1007/BF02459635.
6
Multiple sequence alignment using partial order graphs.
Bioinformatics. 2002 Mar;18(3):452-64. doi: 10.1093/bioinformatics/18.3.452.
7
Murlet: a practical multiple alignment tool for structural RNA sequences.
Bioinformatics. 2007 Jul 1;23(13):1588-98. doi: 10.1093/bioinformatics/btm146. Epub 2007 Apr 25.
8
SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.
Syst Biol. 2012 Jan;61(1):90-106. doi: 10.1093/sysbio/syr095. Epub 2011 Dec 1.
9
Improving the practical space and time efficiency of the shortest-paths approach to sum-of-pairs multiple sequence alignment.
J Comput Biol. 1995 Fall;2(3):459-72. doi: 10.1089/cmb.1995.2.459.
10
Exact and complete short-read alignment to microbial genomes using Graphics Processing Unit programming.
Bioinformatics. 2011 May 15;27(10):1351-8. doi: 10.1093/bioinformatics/btr151. Epub 2011 Mar 30.

Cited by

1
TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments.
PLoS Comput Biol. 2024 Apr 1;20(4):e1011988. doi: 10.1371/journal.pcbi.1011988. eCollection 2024 Apr.
2
Efficient representation of uncertainty in multiple sequence alignments using directed acyclic graphs.
BMC Bioinformatics. 2015 Apr 1;16:108. doi: 10.1186/s12859-015-0516-1.
3
ADLD: a novel graphical representation of protein sequences and its application.
Comput Math Methods Med. 2014;2014:959753. doi: 10.1155/2014/959753. Epub 2014 Oct 30.
4
MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.
BMC Bioinformatics. 2012 May 30;13:117. doi: 10.1186/1471-2105-13-117.
5
Accounting for alignment uncertainty in phylogenomics.
PLoS One. 2012;7(1):e30288. doi: 10.1371/journal.pone.0030288. Epub 2012 Jan 17.
6
The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods.
Nucleic Acids Res. 2007 Jul;35(Web Server issue):W645-8. doi: 10.1093/nar/gkm333. Epub 2007 May 25.
7
M-Coffee: combining multiple sequence alignment methods with T-Coffee.
Nucleic Acids Res. 2006 Mar 23;34(6):1692-9. doi: 10.1093/nar/gkl091. Print 2006.
8
Automatic assessment of alignment quality.
Nucleic Acids Res. 2005 Dec 16;33(22):7120-8. doi: 10.1093/nar/gki1020. Print 2005.
9
DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.
Nucleic Acids Res. 2000 Aug 1;28(15):2919-26. doi: 10.1093/nar/28.15.2919.