Vassura Marco, Margara Luciano, Di Lena Pietro, Medri Filippo, Fariselli Piero, Casadio Rita
Department of Computer Science, University of Bologna, Via Mura Anteo Zamboni 7, 40127 Bologna, Italy.
IEEE/ACM Trans Comput Biol Bioinform. 2008 Jul-Sep;5(3):357-67. doi: 10.1109/TCBB.2008.27.
Predicting a protein's tertiary structure from its residue sequence alone (the so-called Protein Folding Problem) is one of the most challenging problems in Structural Bioinformatics. We focus on the protein residue contact map. When this map is given, it is possible to reconstruct the 3D structure of the protein backbone. The general problem of recovering a set of 3D coordinates consistent with a given contact map is known as the unit-disk-graph realization problem, and it has recently been proven to be NP-Hard. In this paper we describe a heuristic method (COMAR) that is able to reconstruct, at an unprecedented speed (3-15 seconds), a 3D model that exactly matches the target contact map of a protein. Working with a non-redundant set of 1760 proteins, we find that the efficiency of finding a 3D model very close to the protein's native structure depends on the threshold value adopted to compute the protein residue contact map. Contact maps whose threshold values range from 10 to 18 Angstroms allow the reconstruction of 3D models that are very similar to the proteins' native structures.
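To make the role of the threshold concrete, the following is a minimal sketch of how a residue contact map can be computed from 3D coordinates and how a candidate model can be checked against a target map. It is not the COMAR heuristic, which the abstract does not specify. The function names `contact_map` and `map_mismatches` are hypothetical, and the choices of one representative point per residue (for example the C-alpha atom), the `<=` comparison, and an empty diagonal are assumptions made for illustration.

```python
import math


def contact_map(coords, threshold):
    """Return a symmetric boolean matrix of residue contacts.

    coords    -- list of (x, y, z) tuples in Angstroms, one per residue
                 (assumption: one representative point per residue)
    threshold -- contact distance cutoff in Angstroms
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]  # assumption: no self-contacts
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are in contact if they lie within the threshold.
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = True
    return cmap


def map_mismatches(model_coords, target_map, threshold):
    """Count residue pairs whose contact state in the model differs
    from the target map. Zero means the model matches the map exactly."""
    model_map = contact_map(model_coords, threshold)
    n = len(target_map)
    return sum(
        model_map[i][j] != target_map[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )
```

Under these assumptions, a model "exactly matches" a target contact map when `map_mismatches` returns 0. Raising the threshold marks more residue pairs as contacts, so the same structure yields a denser map and therefore more distance constraints for any reconstruction to satisfy.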