Mihalek I, Res I, Lichtarge O
Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA.
Bioinformatics. 2006 Jan 15;22(2):149-56. doi: 10.1093/bioinformatics/bti791. Epub 2005 Nov 22.
Various multiple sequence alignment-based methods have been proposed to detect functional surfaces in proteins, such as active sites or protein interfaces. The effect that the choice of sequences has on the conclusions of such analysis has seldom been discussed. In particular, no method has been discussed in terms of its ability to optimize the sequence selection for the reliable detection of functional surfaces.
Here we propose, for the case of proteins with known structure, a heuristic Metropolis Monte Carlo strategy to select sequences from a large set of homologues, in order to improve detection of functional surfaces. The quantity guiding the optimization is the clustering of residues which are under increased evolutionary pressure, according to the sample of sequences under consideration. We show that we can either improve the overlap of our prediction with known functional surfaces, in comparison with selection by sequence similarity criteria, or match the quality of prediction obtained through more elaborate non-structure-based methods of sequence selection. For the purpose of demonstration we use a set of 50 homodimerizing enzymes which were co-crystallized with their substrates and cofactors.
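The selection strategy described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy alignment, a hypothetical contact set standing in for the known structure, and column entropy as a stand-in for "evolutionary pressure"; the abstract does not specify the ranking measure, the move set, or the temperature. The names `clustering_score` and `metropolis_select` and all parameter values are illustrative only.

```python
import math
import random
from collections import Counter


def clustering_score(alignment, contacts, selected, top_k=3):
    """Toy objective (assumption): fraction of contacting pairs among the
    top_k most conserved columns, judged from the selected sequences only.
    Requires top_k >= 2."""
    rows = [alignment[i] for i in sorted(selected)]
    n_cols = len(rows[0])

    def entropy(col):
        # Shannon entropy of one alignment column; low entropy = conserved.
        counts = Counter(r[col] for r in rows)
        total = len(rows)
        return -sum(c / total * math.log(c / total) for c in counts.values())

    ranked = sorted(range(n_cols), key=entropy)[:top_k]
    pairs = [(a, b) for a in ranked for b in ranked if a < b]
    return sum((a, b) in contacts for a, b in pairs) / len(pairs)


def metropolis_select(n_sequences, score, n_steps=2000, temperature=0.1, seed=0):
    """Metropolis search over subsets of sequence indices, maximizing score."""
    rng = random.Random(seed)
    selected = set(range(n_sequences))  # start from all homologues
    current = score(selected)
    best, best_score = set(selected), current
    for _ in range(n_steps):
        # Propose toggling one sequence in or out of the selection.
        proposal = selected ^ {rng.randrange(n_sequences)}
        if not proposal:
            continue  # never evaluate an empty selection
        delta = score(proposal) - current
        # Accept improvements always, deteriorations with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            selected, current = proposal, current + delta
            if current > best_score:
                best, best_score = set(selected), current
    return best, best_score


# Hypothetical toy data: six aligned sequences and a set of residue
# (column) pairs assumed to be in spatial contact in the structure.
alignment = ["ACDEFG", "ACDEYG", "ACNEFG", "TCDKFG", "ACDEFW", "SCDEFG"]
contacts = {(1, 3), (1, 5), (3, 5), (0, 2)}

best, best_score = metropolis_select(
    len(alignment), lambda s: clustering_score(alignment, contacts, s)
)
print(sorted(best), best_score)
```

Toggling a single sequence per step keeps each proposal local, and tracking the best subset seen means the result is never worse than the starting selection of all homologues. A realistic objective would replace the toy entropy ranking and contact set with a residue ranking and clustering measure computed on the actual structure.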