Hoberman Rose, Klein-Seetharaman Judith, Rosenfeld Roni
School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA.
Appl Bioinformatics. 2004;3(2-3):167-79. doi: 10.2165/00822942-200403020-00011.
In this study, we attempt to understand and explain positional selection pressure in terms of underlying physical and chemical properties. We propose a set of constraining assumptions about how these pressures behave, then describe a procedure for analysing and explaining the distribution of residues at a particular position in a multiple sequence alignment. In contrast to previous approaches, our model takes into account both amino acid frequencies and a large number of physical-chemical properties. By analysing each property separately, it is possible to identify positions where distinct conservation patterns are present. In addition, the model can easily incorporate sequence weights that adjust for bias in the sample sequences. Finally, a test of statistical significance is provided for our conservation measure. The applicability of this method is demonstrated on two HIV-1 proteins: Nef and Env. The tools, data and results presented in this article are available at http://flan.blm.cs.cmu.edu.
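The abstract names three ingredients: weighted residue frequencies at an alignment position, a per-property view of conservation, and a significance test. The sketch below is one way these pieces could fit together. It is not the authors' published model. The function names, the use of weighted variance as the conservation score, the Monte Carlo test against a background distribution, and the choice of Kyte-Doolittle hydropathy as the example property are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# Kyte-Doolittle hydropathy, used here only as one example property scale
# (an assumption; the abstract does not list the properties it analyses).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def weighted_frequencies(column, weights, scale=HYDROPATHY):
    """Weighted residue frequencies for one alignment column.

    Characters not present in `scale` (gaps, unknowns) are skipped.
    """
    totals = defaultdict(float)
    for residue, weight in zip(column, weights):
        if residue in scale:
            totals[residue] += weight
    norm = sum(totals.values())
    if norm == 0:
        raise ValueError("column has no scorable residues")
    return {r: t / norm for r, t in totals.items()}


def property_variance(freqs, scale=HYDROPATHY):
    """Weighted variance of a property; lower means more conserved."""
    mean = sum(f * scale[r] for r, f in freqs.items())
    return sum(f * (scale[r] - mean) ** 2 for r, f in freqs.items())


def conservation_p_value(column, weights, background,
                         scale=HYDROPATHY, trials=2000, seed=0):
    """Monte Carlo estimate of how often a column drawn from `background`
    has property variance at most as large as the observed column."""
    # Keep only scorable positions so random columns match the observed depth.
    kept = [(r, w) for r, w in zip(column, weights) if r in scale]
    if not kept:
        raise ValueError("column has no scorable residues")
    kept_weights = [w for _, w in kept]
    observed = property_variance(
        weighted_frequencies([r for r, _ in kept], kept_weights, scale), scale)

    rng = random.Random(seed)
    residues = list(background)
    probs = [background[r] for r in residues]
    hits = 0
    for _ in range(trials):
        sample = rng.choices(residues, weights=probs, k=len(kept))
        sampled = property_variance(
            weighted_frequencies(sample, kept_weights, scale), scale)
        if sampled <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)
```

Running the same functions with a different property scale in place of HYDROPATHY gives a separate score per property, which is the sense in which a position could look conserved for one property and variable for another. The sequence weights would come from whatever weighting scheme is applied to the alignment; uniform weights reduce the score to plain residue counts.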