Key Laboratory of Intelligent Computing & Information Processing, Ministry of Education, Xiangtan University, Xiangtan, China.
College of Chemistry, Xiangtan University, Xiangtan, China.
BMC Bioinformatics. 2020 Apr 25;21(1):157. doi: 10.1186/s12859-020-3499-5.
Quality assessment of predicted protein tertiary structures, in which the best models are selected from a pool of decoys, is a major challenge in protein structure prediction and is crucial for determining a model's utility and potential applications. Single-model quality assessment estimates a model's quality from that model alone. In general, the Pearson correlation achieved by a quality assessment method increases with the quality of the model pool. However, there is no consensus on the best method for selecting the few good models from a poor-quality model pool.
We introduce a novel single-model quality assessment method for poor-quality models that uses a simple linear combination of six features. We perform weighted search and linear regression on a large dataset of models from the 12th Critical Assessment of Protein Structure Prediction (CASP12) and benchmark the results on CASP13 models. We demonstrate that our method achieves outstanding performance on poor-quality models.
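As a rough illustration of what scoring decoys with a linear combination of six features and searching over the weights can look like, the following Python sketch may help. It is not the authors' implementation. The abstract does not name the six features, the search grid, or the objective, so the coarse grid (`steps`), the non-negative weights summing to 1, and the top-1 loss objective are all assumptions made for this example, and every function name is hypothetical.

```python
from itertools import product


def linear_score(features, weights):
    """Weighted sum of one model's feature values."""
    return sum(w * f for w, f in zip(weights, features))


def top1_loss(models, weights):
    """True quality of the best model minus that of the top-scored model.

    `models` is a list of (features, true_quality) pairs for one target.
    The choice of true-quality measure is left to the caller (assumption).
    """
    best_true = max(q for _, q in models)
    _, picked_true = max(models, key=lambda m: linear_score(m[0], weights))
    return best_true - picked_true


def weight_grid(n_features, steps):
    """Yield non-negative weight vectors on a coarse grid that sum to 1."""
    for combo in product(range(steps + 1), repeat=n_features):
        if sum(combo) == steps:
            yield tuple(c / steps for c in combo)


def search_weights(targets, n_features=6, steps=4):
    """Return the grid weights with the lowest mean top-1 loss over targets.

    Exhaustive grid search is an assumed stand-in for the weighted search
    mentioned in the text; it is only practical for small `steps`.
    """
    def mean_loss(weights):
        return sum(top1_loss(m, weights) for m in targets) / len(targets)

    return min(weight_grid(n_features, steps), key=mean_loss)
```

In this sketch, `targets` would hold one list of decoys per training target (for example, one per CASP12 target), and the returned weights would then be applied unchanged with `linear_score` to rank the decoys of held-out targets.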
Our results on assessing poor-quality protein structures with six features indicate that contact prediction and reliance on fewer prediction features can improve selection accuracy.