Zhang C T, Zhang R
Department of Physics, Tianjin University, China.
J Biomol Struct Dyn. 2000 Apr;17(5):829-42. doi: 10.1080/07391102.2000.10506572.
Algorithms for secondary structure prediction have been under development for nearly 30 years. However, the problem of how to evaluate and compare such algorithms appropriately has not yet been completely solved. A graphic method for evaluating secondary structure prediction algorithms is proposed here. Traditionally, the performance of an algorithm is summarized by a single number, namely an accuracy, for which various definitions exist. Instead of a number, we use a graph to evaluate an algorithm more completely, in which mapping points are distributed in a three-dimensional (3D) space. Each point represents the prediction result for the secondary structure of one protein. Because the distribution of mapping points in the 3D space generally contains more information than a number or a set of numbers, the proposed graphic method is expected to allow algorithms to be evaluated and compared more objectively. Based on the point distribution, six evaluation parameters are proposed that describe the overall performance of the algorithm being evaluated. Furthermore, the graphic method is simple and intuitive. As an example application, two advanced algorithms, the PHD and NNpredict methods, are evaluated and compared. The results show that there is still much room for improvement in both algorithms. In particular, for both algorithms, the accuracy of predicting the alpha-helix in proteins with high alpha-helix content, and of predicting the beta-strand in proteins with high beta-strand content, should be greatly improved.
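The contrast between a single accuracy number and one point per protein can be illustrated with a minimal sketch. The abstract does not state which quantities serve as the three coordinates, so the choice below (the per-state accuracies for helix, strand, and coil) is an assumption made purely for illustration, as are the function names `q3` and `per_state_point` and the one-letter state codes H, E, and C. It is not the authors' mapping, and it does not attempt the six evaluation parameters.

```python
import math
from typing import Tuple

# Assumed three-state alphabet: H = alpha-helix, E = beta-strand, C = coil.
STATES = ("H", "E", "C")


def q3(observed: str, predicted: str) -> float:
    """Fraction of residues whose predicted state matches the observed one."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def per_state_point(observed: str, predicted: str) -> Tuple[float, float, float]:
    """Map one protein to a 3D point of per-state accuracies.

    Hypothetical coordinate choice: for each state, the fraction of
    residues observed in that state that were also predicted in it.
    A state absent from the observed structure yields NaN.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    coords = []
    for state in STATES:
        total = sum(o == state for o in observed)
        correct = sum(o == state and p == state
                      for o, p in zip(observed, predicted))
        coords.append(correct / total if total else math.nan)
    return (coords[0], coords[1], coords[2])


if __name__ == "__main__":
    # Made-up toy sequences, not data from the paper.
    observed = "HHHHEEEECC"
    predicted = "HHHCEEECCC"
    print(q3(observed, predicted))               # single-number summary
    print(per_state_point(observed, predicted))  # one point in 3D space
```

Collecting one such point per protein in a test set gives a cloud of points whose spread shows, for example, whether an algorithm is weak on helix-rich proteins, which a single averaged accuracy would hide.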