Wallner Björn, Elofsson Arne
Stockholm Bioinformatics Center, SCFAB, Stockholm University, SE-106 91 Stockholm, Sweden.
Protein Sci. 2003 May;12(5):1073-86. doi: 10.1110/ps.0236803.
The ability to separate correct models of protein structures from less correct models is of the greatest importance for protein structure prediction methods. Several studies have examined the ability of different types of energy function to detect the native, or native-like, protein structure in a large set of decoys. In contrast to earlier studies, we examine here the ability to detect models that show only limited structural similarity to the native structure. A model is defined as correct if it contains a fragment that shows significant similarity to the native structure. It has been shown that the existence of such fragments is useful for comparing the performance of different fold recognition methods and that this performance correlates well with performance in fold recognition. We have developed ProQ, a neural-network-based method that extracts structural features from a protein model, such as the frequency of atom-atom contacts, and predicts the quality of the model as measured by either LGscore or MaxSub. We show that ProQ performs at least as well as other measures when identifying the native structure and is better at detecting correct models. This performance is maintained over several different test sets. ProQ can also be combined with the Pcons fold recognition predictor, a combination called Pmodeller, to increase its performance, with the main advantage being the elimination of a few high-scoring incorrect models. Pmodeller was successful in CASP5, and results from the latest LiveBench, LiveBench-6, indicate that Pmodeller has a higher specificity than Pcons alone.
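The abstract names the frequency of atom-atom contacts as one of the structural features fed to the network. As a rough illustration of what such a feature could look like, the sketch below counts contacts between atom types and normalizes them to fractions. The function name `contact_frequencies`, the 5.0 Å cutoff, the single-letter atom types, and the toy coordinates are all assumptions made for this example; they are not ProQ's actual definitions or parameters, and a real implementation would likely also need to handle bonded atoms and neighbouring residues, which this sketch ignores.

```python
from collections import Counter
from itertools import combinations
from math import dist


def contact_frequencies(atoms, cutoff=5.0):
    """Return the fraction of all contacts that falls on each atom-type pair.

    atoms:  list of (atom_type, (x, y, z)) tuples.
    cutoff: distance below which two atoms count as being in contact
            (an assumed value, not taken from the paper).
    """
    counts = Counter()
    for (type_a, pos_a), (type_b, pos_b) in combinations(atoms, 2):
        if dist(pos_a, pos_b) < cutoff:
            # Sort the pair so (C, N) and (N, C) land in the same bin.
            counts[tuple(sorted((type_a, type_b)))] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Toy coordinates, invented for illustration only.
toy_model = [
    ("C", (0.0, 0.0, 0.0)),
    ("N", (1.5, 0.0, 0.0)),
    ("O", (0.0, 3.0, 0.0)),
    ("C", (20.0, 0.0, 0.0)),
]
features = contact_frequencies(toy_model)
```

Normalizing by the total number of contacts keeps the feature vector comparable between models of different sizes, which is one plausible reason to use frequencies rather than raw counts.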