Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA.
Structure. 2010 Jul 14;18(7):858-67. doi: 10.1016/j.str.2010.04.007.
Protein template identification is essential to protein structure and function prediction. However, conventional whole-chain threading approaches often fail to recognize conserved substructure motifs when the target and templates do not share the same fold. We developed a new approach, SEGMER, for identifying protein substructure similarities by segmental threading. The target sequence is split into segments of two to four consecutive or nonconsecutive secondary structural elements, which are then threaded through the PDB to identify appropriate substructure motifs. SEGMER was tested on 144 nonredundant hard proteins. When SEGMER is combined with whole-chain threading, the TM-score of the alignments and the accuracy of the spatial restraints increase by 16% and 25%, respectively, compared with whole-chain threading alone. When tested on 12 free modeling targets from CASP8, SEGMER increases the TM-score and contact accuracy by 28% and 48%, respectively. This significant improvement should have an important impact on protein structure modeling and functional inference.
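The segmentation step described above can be illustrated with a minimal sketch. It is not the SEGMER implementation: the abstract does not say how secondary structure is assigned or which nonconsecutive combinations are kept, so the sketch assumes a predicted secondary structure string (H = helix, E = strand, C = coil), a hypothetical minimum element length `min_len`, and exhaustive enumeration of every combination of two to four elements. The names `find_sses`, `enumerate_segments`, and `is_consecutive` are illustrative only.

```python
from itertools import combinations, groupby


def find_sses(ss, min_len=3):
    """Return (start, end, type) for each helix/strand run; end is exclusive.

    min_len is an assumed cutoff for ignoring very short runs.
    """
    sses = []
    pos = 0
    for state, run in groupby(ss):
        length = len(list(run))
        if state in ("H", "E") and length >= min_len:
            sses.append((pos, pos + length, state))
        pos += length
    return sses


def enumerate_segments(sses, min_k=2, max_k=4):
    """Yield every combination of min_k..max_k elements, kept in sequence order.

    Exhaustive enumeration is an assumption; the original method may
    filter combinations (for example by spatial or sequence separation).
    """
    for k in range(min_k, max_k + 1):
        yield from combinations(sses, k)


def is_consecutive(segment, sses):
    """True if the chosen elements are adjacent in the list of elements."""
    idx = [sses.index(s) for s in segment]
    return all(b - a == 1 for a, b in zip(idx, idx[1:]))


# Toy secondary structure string with four elements: H, E, H, E.
ss = "CCHHHHHHCCEEEEECCCHHHHHCCEEEECC"
sses = find_sses(ss)
segments = list(enumerate_segments(sses))
n_consecutive = sum(is_consecutive(seg, sses) for seg in segments)

for seg in segments:
    label = "consecutive" if is_consecutive(seg, sses) else "nonconsecutive"
    print(label, seg)
```

In the actual method, each such segment would then be threaded against PDB structures; that step depends on scoring functions and template libraries not described in the abstract, so it is left out here.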