Stanfel L E
University of Alabama, Tuscaloosa 35487-0226, USA.
J Theor Biol. 1996 Nov 21;183(2):195-205. doi: 10.1006/jtbi.1996.0213.
Each amino acid is represented by a vector of numerical measurements of seven attributes: volume, area, hydrophilicity, polarity, hydrogen bonding, shape, and charge. Inter-residue distances are then calculated according to common metrics, and we introduce a new clustering objective function derived from information-theoretic considerations. The arguments of the function are the pairwise distances between the objects to be clustered, in this case the amino acids. The residues are then partitioned into clusters by approximately solving an integer programming problem. The clusters obtained are compared with groups obtained in substitution/mutation studies and found to be similar. This supplies what is probably the strongest and most objective evidence to date that physico-chemical properties account for the viability of substitutions, and that the important similarities and differences among residues are explained by a relatively small and simple set of properties.
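The distance step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the attribute values below are hypothetical placeholders rather than measured data, only three of the seven attributes are shown, the z-score scaling is an assumption made so that no single attribute dominates, and the names `standardize`, `euclidean`, `manhattan`, and `distance_table` are introduced here for illustration. The information-theoretic objective function and the integer programming step are not specified in the abstract, so they are not sketched.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

# Hypothetical placeholder values, NOT data from the paper.
# Columns (assumed order): volume, hydrophilicity, charge.
RESIDUES = {
    "Asp": (1.0, 3.0, -1.0),
    "Glu": (2.0, 3.0, -1.0),
    "Leu": (3.0, -2.0, 0.0),
    "Ile": (3.0, -1.5, 0.0),
}


def standardize(vectors):
    """Z-score each attribute column (an assumption, not stated in the abstract)."""
    names = list(vectors)
    columns = list(zip(*(vectors[n] for n in names)))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return {
        n: tuple(
            (x - mu) / sd if sd > 0 else 0.0
            for x, (mu, sd) in zip(vectors[n], stats)
        )
        for n in names
    }


def euclidean(u, v):
    """Straight-line distance between two attribute vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """City-block distance between two attribute vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_table(vectors, metric):
    """Map each unordered residue pair (in insertion order) to its distance."""
    return {
        (a, b): metric(vectors[a], vectors[b])
        for a, b in combinations(vectors, 2)
    }


if __name__ == "__main__":
    scaled = standardize(RESIDUES)
    for pair, d in distance_table(scaled, euclidean).items():
        print(pair, round(d, 3))
```

A table of this kind is the sort of input the abstract attributes to the clustering objective: the function takes inter-object distances as its arguments, so any metric can be substituted without changing the rest of the procedure.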