
Aligning amino acid sequences: comparison of commonly used methods.

Authors

Feng D F, Johnson M S, Doolittle R F

Publication information

J Mol Evol. 1984;21(2):112-25. doi: 10.1007/BF02100085.

PMID: 6100188
Abstract

We examined two extensive families of protein sequences using four different alignment schemes that employ various degrees of "weighting" in order to determine which approach is most sensitive in establishing relationships. All alignments used a similarity approach based on a general algorithm devised by Needleman and Wunsch. The approaches included a simple program, UM (unitary matrix), whereby only identities are scored; a scheme in which the genetic code is used as a basis for weighting (GC); another that employs a matrix based on structural similarity of amino acids taken together with the genetic basis of mutation (SG); and a fourth that uses the empirical log-odds matrix (LOM) developed by Dayhoff on the basis of observed amino acid replacements. The two sequence families examined were (a) nine different globins and (b) nine different tyrosine kinase-like proteins. It was assumed a priori that all members of a family share common ancestry. In cases where two sequences were more than 30% identical, alignments by all four methods were almost always the same. In cases where the percentage identity was less than 20%, however, there were often significant differences in the alignments. On the average, the Dayhoff LOM approach was the most effective in verifying distant relationships, as judged by an empirical "jumbling test." This was not universally the case, however, and in some instances the simple UM was actually as good or better. Trees constructed on the basis of the various alignments differed with regard to their limb lengths, but had essentially the same branching orders. We suggest some reasons for the different effectivenesses of the four approaches in the two different sequence settings, and offer some rules of thumb for assessing the significance of sequence relationships.
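The comparison described in the abstract can be illustrated with a minimal sketch of its simplest case: a Needleman-Wunsch global alignment scored with a unitary matrix (UM, identities only), followed by a "jumbling test" that compares the real score with scores from shuffled versions of the same sequences. The function names, the linear gap penalty of -1, the number of shuffles and the z-score summary are all assumptions made for illustration; the abstract does not give the parameters or the exact procedure the authors used.

```python
import random
from statistics import mean, stdev


def nw_score(a, b, match=1, mismatch=0, gap=-1):
    """Needleman-Wunsch global alignment score.

    The default match/mismatch values mimic a unitary matrix (only
    identities score). The linear gap penalty is an assumed value.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def jumbling_test(a, b, trials=50, seed=0):
    """Hypothetical jumbling test: score of the real pair versus shuffles.

    Returns (real_score, z), where z is the number of standard deviations
    by which the real score exceeds the mean score of shuffled pairs.
    Shuffling keeps length and composition but destroys residue order.
    """
    rng = random.Random(seed)
    real = nw_score(a, b)
    shuffled = [
        nw_score(rng.sample(a, len(a)), rng.sample(b, len(b)))
        for _ in range(trials)
    ]
    sd = stdev(shuffled)
    z = (real - mean(shuffled)) / sd if sd > 0 else 0.0
    return real, z


if __name__ == "__main__":
    # Toy sequences chosen for illustration, not data from the paper.
    seq1 = "VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSF"
    seq2 = "VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGF"
    score, z = jumbling_test(seq1, seq2)
    print(f"score={score}, z={z:.1f}")
```

Replacing the `match`/`mismatch` pair with a lookup into a substitution table would turn the same recurrence into a weighted scheme such as GC, SG or LOM; those matrices are not reproduced here.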

