Efficient methods for multiple sequence alignment with guaranteed error bounds.

Author information

Gusfield D

Affiliation

Computer Science Division, University of California, Davis 95616-8755.

Publication information

Bull Math Biol. 1993 Jan;55(1):141-54. doi: 10.1007/BF02460299.

DOI: 10.1007/BF02460299
PMID: 7680269

Abstract

Multiple string (sequence) alignment is a difficult and important problem in computational biology, where it is central in two related tasks: finding highly conserved subregions or embedded patterns of a set of biological sequences (strings of DNA, RNA or amino acids), and inferring the evolutionary history of a set of taxa from their associated biological sequences. Several precise measures have been proposed for evaluating the goodness of a multiple alignment, but no efficient methods are known which compute the optimal alignment for any of these measures in any but small cases. In this paper, we consider two previously proposed measures, and give two computationally efficient multiple alignment methods (one for each measure) whose deviation from the optimal value is guaranteed to be less than a factor of two. This is the novel feature of these methods, but the methods have additional virtues as well. For both methods, the guaranteed bounds are much smaller than two when the number of strings is small (1.33 for three strings of any length); for one of the methods we give a related randomized method which is much faster and which gives, with high probability, multiple alignments with fairly small error bounds; and for the other measure, the method given yields a non-obvious lower bound on the optimal alignment.
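The abstract does not name the methods or state the bound as a formula. As a rough illustration only, the sketch below assumes the "center star" idea usually associated with this paper for the sum-of-pairs measure: pick the string whose total pairwise distance to all others is smallest, and align every other string to it. The function names (`edit_distance`, `choose_center`, `guaranteed_ratio`), the unit-cost edit distance, and the ratio 2 - 2/k are assumptions made for the example; the ratio is chosen because it agrees with the two figures the abstract does give (below 2 in general, 1.33 for three strings).

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance (insert, delete, substitute all cost 1).

    This cost satisfies the triangle inequality, which a factor-of-two
    style guarantee relies on. It is an illustrative choice, not a
    scoring scheme taken from the abstract.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev[-1]


def choose_center(seqs: list[str]) -> tuple[int, int]:
    """Return (index, total) of the string minimizing the sum of its
    pairwise distances to all other strings (the assumed 'center')."""
    k = len(seqs)
    dist = [[0] * k for _ in range(k)]
    for i, j in combinations(range(k), 2):
        dist[i][j] = dist[j][i] = edit_distance(seqs[i], seqs[j])
    totals = [sum(row) for row in dist]
    center = min(range(k), key=totals.__getitem__)
    return center, totals[center]


def guaranteed_ratio(k: int) -> float:
    """Assumed worst-case ratio 2 - 2/k for k strings (hypothetical helper)."""
    return 2 - 2 / k


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "TCGT"]
    print(choose_center(seqs))
    print(round(guaranteed_ratio(len(seqs)), 2))
```

Computing all pairwise distances takes a quadratic number of pairwise alignments, which is presumably the cost that the faster randomized variant mentioned in the abstract avoids; that variant is not sketched here because the abstract gives no details about it.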



