Robles Víctor, Larrañaga Pedro, Peña José M, Menasalvas Ernestina, Pérez María S, Herves Vanessa, Wasilewska Anita
Department of Computer Architecture and Technology, Technical University of Madrid, Madrid, Spain.
Artif Intell Med. 2004 Jun;31(2):117-36. doi: 10.1016/j.artmed.2004.01.009.
Successful secondary structure predictions provide a starting point for direct tertiary structure modelling, and can also significantly improve sequence analysis and sequence-structure threading as aids to structure and function determination. Improving the predictive accuracy of secondary structure prediction is therefore essential for the future development of the whole field of protein research. In this work we present several multi-classifiers that combine the predictions of the best current classifiers available on the Internet. Our results show that combining the predictions of a set of classifiers into composite classifiers is a fruitful approach: the multi-classifiers we have created are more accurate than any of their component classifiers. The multi-classifiers are based on Bayesian networks and are validated on 9 different datasets. Their predictive accuracy outperforms the best secondary structure predictors by 1.21% on average. Our main contributions are: (i) we improved the best known predictive accuracy by 1.21%, (ii) our best results were obtained with a new semi-naïve Bayes approach named Pazzani-EDA, and (iii) our multi-classifiers combine the predictions of previously built classifiers, obtained through the Internet thanks to a Java application we developed.
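To make the idea of a Bayesian multi-classifier concrete, the following is a minimal sketch of a plain naive Bayes combiner that takes the per-residue outputs of several base predictors as features and returns a combined state. It is an illustration only, not the Pazzani-EDA method reported in the abstract, which is a semi-naïve Bayes variant. The class name `NaiveBayesCombiner`, the three-state alphabet `H`/`E`/`C`, the Laplace smoothing parameter `alpha`, and the toy training data are all assumptions introduced here.

```python
import math
from collections import Counter

# Assumed three-state alphabet: helix, strand, coil.
STATES = ("H", "E", "C")


class NaiveBayesCombiner:
    """Hypothetical naive Bayes combiner over base-predictor outputs."""

    def __init__(self, n_predictors, alpha=1.0):
        self.n_predictors = n_predictors
        self.alpha = alpha  # Laplace smoothing constant (an assumption)
        self.class_counts = Counter()
        # counts[i][(true_state, predicted_state)] for base predictor i
        self.counts = [Counter() for _ in range(n_predictors)]

    def fit(self, base_predictions, true_states):
        """Count how often each predictor emits each state per true state."""
        for preds, truth in zip(base_predictions, true_states):
            self.class_counts[truth] += 1
            for i, p in enumerate(preds):
                self.counts[i][(truth, p)] += 1
        return self

    def log_scores(self, preds):
        """Unnormalised log posterior of each state given the base outputs."""
        total = sum(self.class_counts.values())
        k = len(STATES)
        scores = {}
        for s in STATES:
            # Smoothed class prior.
            lp = math.log((self.class_counts[s] + self.alpha)
                          / (total + self.alpha * k))
            # Smoothed likelihood of each predictor's output given state s,
            # treated as conditionally independent (the naive Bayes assumption).
            for i, p in enumerate(preds):
                lp += math.log((self.counts[i][(s, p)] + self.alpha)
                               / (self.class_counts[s] + self.alpha * k))
            scores[s] = lp
        return scores

    def predict(self, preds):
        scores = self.log_scores(preds)
        return max(STATES, key=lambda s: scores[s])


# Toy data: each row holds the outputs of three hypothetical base predictors
# for one residue; predictor 0 is always right, the other two are noisy.
train_x = [
    ("H", "H", "E"), ("H", "E", "E"), ("H", "C", "H"),
    ("E", "E", "E"), ("E", "H", "E"), ("E", "E", "C"),
    ("C", "C", "C"), ("C", "C", "E"), ("C", "H", "C"),
]
train_y = ["H", "H", "H", "E", "E", "E", "C", "C", "C"]

combiner = NaiveBayesCombiner(n_predictors=3).fit(train_x, train_y)

# A majority vote over ("H", "E", "E") would give "E"; the combiner has
# learned that predictor 0 is the most reliable and returns "H".
print(combiner.predict(("H", "E", "E")))  # prints H
```

Unlike a majority vote, a combiner of this kind weights each base predictor by how reliably its outputs tracked the true state in the training data, which is one way a composite classifier can end up more accurate than any of its components.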