Evaluation measures of multiple sequence alignments.

Authors

Gonnet G H, Korostensky C, Benner S

Affiliation

Institute for Scientific Computing, ETH Zurich, Switzerland.

Publication

J Comput Biol. 2000 Feb-Apr;7(1-2):261-76. doi: 10.1089/10665270050081513.

DOI:10.1089/10665270050081513
PMID:10890401
Abstract

Multiple sequence alignments (MSAs) are frequently used in the study of families of protein sequences or DNA/RNA sequences. They are a fundamental tool for the understanding of the structure, functionality and, ultimately, the evolution of proteins. A new algorithm, the Circular Sum (CS) method, is presented for formally evaluating the quality of an MSA. It is based on the use of a solution to the Traveling Salesman Problem, which identifies a circular tour through an evolutionary tree connecting the sequences in a protein family. With this approach, the calculation of an evolutionary tree and the errors that it would introduce can be avoided altogether. The algorithm gives an upper bound, the best score that can possibly be achieved by any MSA for a given set of protein sequences. Alternatively, if presented with a specific MSA, the algorithm provides a formal score for the MSA, which serves as an absolute measure of the quality of the MSA. The CS measure yields a direct connection between an MSA and the associated evolutionary tree. The measure can be used as a tool for evaluating different methods for producing MSAs. A brief example of the last application is provided. Because it weights all evolutionary events on a tree identically, but does not require the reconstruction of a tree, the CS algorithm has advantages over the frequently used sum-of-pairs measures for scoring MSAs, which weight some evolutionary events more strongly than others. Compared to other weighted sum-of-pairs measures, it has the advantage that no evolutionary tree must be constructed, because we can find a circular tour without knowing the tree.
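
The circular-tour idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names (`pair_score`, `shortest_circular_tour`, `circular_sum`), the toy match/mismatch/gap scoring, the brute-force tour search (practical only for a handful of sequences), and the halving of the tour sum, which assumes that a circular tour crosses each tree edge twice. The pairwise distance matrix is supplied by hand here; in practice it would come from pairwise alignments.

```python
from itertools import permutations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score the pairwise alignment induced by two MSA rows (toy scoring, assumed)."""
    score = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # a column with two gaps says nothing about this pair
        if x == "-" or y == "-":
            score += gap
        elif x == y:
            score += match
        else:
            score += mismatch
    return score


def shortest_circular_tour(dist):
    """Brute-force the shortest closed tour over all sequences (tiny inputs only)."""
    n = len(dist)
    best_order, best_len = None, float("inf")
    for perm in permutations(range(1, n)):
        order = (0,) + perm  # fix sequence 0 as the start to skip rotations
        length = sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))
        if length < best_len:
            best_order, best_len = order, length
    return list(best_order)


def circular_sum(msa, order):
    """Half the sum of induced pairwise scores between neighbours on the tour."""
    n = len(order)
    total = sum(
        pair_score(msa[order[i]], msa[order[(i + 1) % n]]) for i in range(n)
    )
    return total / 2  # assumption: each tree edge is traversed twice by the tour


# Hypothetical distances consistent with a tree ((0,1),(2,3)).
dist = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]
msa = ["AC-T", "ACGT", "A-GT", "ACGT"]

tour = shortest_circular_tour(dist)
cs_score = circular_sum(msa, tour)
```

Note that no tree is built at any point: the tour is derived from pairwise distances alone, which is the property the abstract highlights. Applying `circular_sum` to optimal pairwise alignments along the same tour, instead of the rows of a given MSA, would give the corresponding upper bound under the same assumptions.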
