Livingstone C D, Barton G J
Laboratory of Molecular Biophysics, University of Oxford, UK.
Comput Appl Biosci. 1993 Dec;9(6):745-56. doi: 10.1093/bioinformatics/9.6.745.
An algorithm is described for the systematic characterization of the physico-chemical properties seen at each position in a multiple protein sequence alignment. The new algorithm allows questions important in the design of mutagenesis experiments to be quickly answered since positions in the alignment that show unusual or interesting residue substitution patterns may be rapidly identified. The strategy is based on a flexible set-based description of amino acid properties, which is used to define the conservation between any group of amino acids. Sequences in the alignment are gathered into subgroups on the basis of sequence similarity, functional, evolutionary or other criteria. All pairs of subgroups are then compared to highlight positions that confer the unique features of each subgroup. The algorithm is encoded in the computer program AMAS (Analysis of Multiply Aligned Sequences) which provides a textual summary of the analysis and an annotated (boxed, shaded and/or coloured) multiple sequence alignment. The algorithm is illustrated by application to an alignment of 67 SH2 domains where patterns of conserved hydrophobic residues that constitute the protein core are highlighted. The analysis of charge conservation across annexin domains identifies the locations at which conserved charges change sign. The algorithm simplifies the analysis of multiple sequence data by condensing the mass of information present, and thus allows the rapid identification of substitutions of structural and functional importance.
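The set-based idea in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the property table is a simplified, hypothetical one (not the table used by AMAS), the names `PROPERTIES`, `shared_properties` and `compare_subgroups` are invented here, and "conservation" is reduced to the set of properties held by every residue at a position. Gaps and unequal sequence lengths are not handled.

```python
# Hypothetical, simplified property table; NOT the one used by AMAS.
PROPERTIES = {
    "hydrophobic": set("AVLIMFWC"),
    "polar": set("STNQDEKRHY"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "small": set("AGSTCVDNP"),
    "aromatic": set("FWYH"),
}


def shared_properties(residues):
    """Return the names of the properties held by every residue in the group."""
    return {
        name
        for name, members in PROPERTIES.items()
        if all(r in members for r in residues)
    }


def column(alignment, i):
    """Return the residues found at position i of each aligned sequence."""
    return [seq[i] for seq in alignment]


def compare_subgroups(group_a, group_b):
    """List positions where the two subgroups conserve different properties.

    Each entry is (position, properties only in A, properties only in B).
    Assumes all sequences have the same length and contain no gaps.
    """
    report = []
    for i in range(len(group_a[0])):
        a = shared_properties(column(group_a, i))
        b = shared_properties(column(group_b, i))
        if a != b:
            report.append((i, a - b, b - a))
    return report


# Toy subgroups (invented data): position 1 carries a conserved charge
# that changes sign between the two subgroups.
group_a = ["LKD", "IKD", "VRE"]
group_b = ["LED", "MDD", "VEE"]
result = compare_subgroups(group_a, group_b)
```

With these toy subgroups, `result` flags only position 1, where the first subgroup conserves a positive charge and the second a negative one. This is the kind of sign change the abstract reports for the annexin domains, although the real analysis works on a much richer property description.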