Chen Luonan, Wu Ling-Yun, Wang Yong, Zhang Shihua, Zhang Xiang-Sun
Institute of Systems Biology, Shanghai University, Shanghai 200444, China.
BMC Struct Biol. 2006 Sep 2;6:18. doi: 10.1186/1472-6807-6-18.
Protein structure comparison is one of the most important problems in computational biology and plays a key role in protein structure prediction, fold family classification, motif finding, phylogenetic tree reconstruction and protein docking.
We propose a novel method for comparing protein structures accurately and efficiently. The method can be used not only to reveal divergent evolution, but also to identify circular permutations and detect active sites. Specifically, we define structure alignment as a multi-objective optimization problem: maximizing the number of aligned atoms while minimizing their root mean square distance. By controlling a single distance-related parameter, we can in theory obtain a range of optimal alignments corresponding to different optimal matching patterns, from a large matching portion to a small one. The number of variables in our algorithm grows almost linearly with the number of atoms in the protein pair. Beyond its solid theoretical background, the approach showed significant improvement over existing methods in numerical experiments, in terms of both quality and efficiency. In particular, we show that divergent evolution, circular permutations and active sites (or structural motifs) can be identified by our method. The software SAMO is available upon request from the authors, or from http://zhangroup.aporc.org/bioinfo/samo/ and http://intelligent.eic.osaka-sandai.ac.jp/chenen/samo.htm.
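To make the trade-off between the two objectives concrete, one common way to combine them into a single objective is sketched below. This is an illustrative scalarization, not necessarily the formulation used in SAMO: the symbols a_i and b_j (atom coordinates of the two proteins), R and t (a rigid rotation and translation), x_ij (binary matching variables) and λ (standing in for the distance-related parameter) are all assumptions introduced here.

```latex
\max_{x,\,R,\,t}\ \sum_{i=1}^{n}\sum_{j=1}^{m}
    \bigl(\lambda - \lVert a_i - (R\,b_j + t)\rVert^{2}\bigr)\,x_{ij}
\quad\text{s.t.}\quad
\sum_{j=1}^{m} x_{ij} \le 1,\qquad
\sum_{i=1}^{n} x_{ij} \le 1,\qquad
x_{ij}\in\{0,1\}.
```

Under this sketch a pair contributes positively only when its squared distance is below λ, so a large λ favors a large matching portion and a small λ favors a small, tight one. Note that this naive form uses n·m matching variables, whereas the abstract states that the authors' algorithm keeps the number of variables almost linear; the sketch does not attempt to reproduce that property.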
We propose a novel formulation for accurately aligning protein structures in the framework of multi-objective optimization, based on a sequence order-independent strategy. A fast and accurate algorithm built on bipartite matching is developed by exploiting the special features of this formulation. Convergence of the computation is demonstrated experimentally and proven theoretically.
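As a rough illustration of how a bipartite matching step can produce a sequence order-independent alignment, the following minimal sketch matches atoms of two structures that are assumed to be already superposed. It is not the SAMO implementation: the function names, the parameter `lam` (a squared-distance threshold standing in for the distance-related parameter) and the toy coordinates are all hypothetical, and the rigid-body superposition step that a full method would need is omitted. Each real pair is given cost d² − `lam`, and zero-cost dummy columns let an atom stay unmatched.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))

def min_cost_assignment(cost):
    """Hungarian algorithm: assign every row to a distinct column at
    minimum total cost. Requires len(cost) <= len(cost[0])."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)
    p, way = [0] * (m + 1), [0] * (m + 1)   # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                            # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    rows = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            rows[p[j] - 1] = j - 1
    return rows

def match_atoms(a, b, lam):
    """Match atoms of a to atoms of b; a pair is worth matching only if
    its squared distance is below lam (hypothetical parameter)."""
    n, m = len(a), len(b)
    # Real columns cost d^2 - lam; n dummy columns cost 0 (stay unmatched).
    cost = [[sq_dist(p, q) - lam for q in b] + [0.0] * n for p in a]
    return [(i, j) for i, j in enumerate(min_cost_assignment(cost)) if j < m]

def rmsd(a, b, pairs):
    """Root mean square distance over the matched pairs."""
    if not pairs:
        return 0.0
    return math.sqrt(sum(sq_dist(a[i], b[j]) for i, j in pairs) / len(pairs))

# Toy coordinates, invented for illustration only.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
B = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0), (2.5, 0.0, 0.0)]

for lam in (0.05, 1.0):
    pairs = match_atoms(A, B, lam)
    print(lam, pairs, round(rmsd(A, B, pairs), 3))
```

With the smaller `lam` only the two closest pairs are matched; with the larger one a third, looser pair joins the alignment and the RMSD rises, which mirrors the large-portion versus small-portion trade-off described in the abstract. Because the matching ignores the order of atoms along the chain, the same step would also pair up segments that have been circularly permuted.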