Ishida Takashi, Nishimura Takeshi, Nozaki Makoto, Inoue Tsuyoshi, Terada Tohru, Nakamura Shugo, Shimizu Kentaro
Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan.
Genome Inform. 2003;14:228-37.
An ab initio protein structure prediction system called ABLE is described. It is based on the fragment assembly method, which consists of two steps: (1) dividing a target sequence into short overlapping subsequences (fragments) and assigning a local structure to each fragment; and (2) generating models by assembling the local structures and selecting the models with low potential energy. One of the most important problems in conventional fragment assembly methods is that native-like structures are difficult to select by energy minimization alone. ABLE therefore employs a structural clustering method to select native-like models from among the generated models. By using the unit-vector root mean square distance (URMS) as the measure of structural similarity, we achieve more robust and effective structural clustering. When not enough clusters of good quality are obtained, ABLE runs the energy minimization procedure again, incorporating structural restraints derived from the consensus substructures of the previously generated models. This approach is based on our observation that the consensus substructures of the generated models have a high probability of being native-like. Another feature of ABLE is that, in assigning local structures to fragments, it assigns main-chain dihedral angles (phi, psi) to the central residue of each fragment according to a probability distribution map built from candidate sequences similar to that fragment. This enables the system to generate appropriate local structures that may not already exist in a protein structure database. We applied our system to 25 small proteins and obtained near-native folds for more than half of them. We also demonstrate the performance of our structural clustering method, which can be applied to other protein structure prediction systems.
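To illustrate the URMS measure used for clustering, the following is a minimal sketch, assuming the commonly used formulation: each chain is reduced to unit vectors between consecutive C-alpha atoms, the vectors are placed at the origin, and the RMS distance between the two vector sets is minimized over rotations. The abstract does not specify ABLE's implementation; the function names `unit_vectors` and `urms` are hypothetical, and the Kabsch-style SVD fit is one standard way to find the optimal rotation.

```python
import numpy as np

def unit_vectors(ca):
    """Unit vectors between consecutive C-alpha positions; ca has shape (N, 3)."""
    d = np.diff(np.asarray(ca, dtype=float), axis=0)
    return d / np.linalg.norm(d, axis=1, keepdims=True)

def urms(ca_a, ca_b):
    """RMS distance between two unit-vector sets after the optimal rotation."""
    u, v = unit_vectors(ca_a), unit_vectors(ca_b)
    if u.shape != v.shape:
        raise ValueError("chains must have the same number of residues")
    # All vectors start at the origin, so only a rotation is fitted (no centering).
    h = u.T @ v                      # 3x3 correlation matrix
    w, s, vt = np.linalg.svd(h)
    # Flip the smallest singular value if the best orthogonal fit is a reflection.
    s[-1] *= np.sign(np.linalg.det(w @ vt))
    # min over rotations of sum |R u_i - v_i|^2 = 2n - 2 * (sum of corrected singular values)
    n = len(u)
    msd = max(2.0 * n - 2.0 * s.sum(), 0.0) / n
    return float(np.sqrt(msd))
```

Because every vector lies on the unit sphere, this distance is bounded (it can never exceed 2) regardless of chain length. A pairwise matrix of `urms` values over the generated models could then be passed to any standard clustering routine.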