Nanni Loris, Lumini Alessandra
DEIS, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy.
Amino Acids. 2009 Mar;36(3):409-16. doi: 10.1007/s00726-008-0076-z. Epub 2008 Apr 10.
This work focuses on the use of ensembles of classifiers for predicting HIV protease cleavage sites in proteins. Because of the complex relationships in biological data, several recent studies show that ensembles of learning algorithms often outperform stand-alone methods. We show that fusing approaches based on different encoding models can improve performance on this classification problem. In particular, four different feature encodings for peptides are described and tested. We report an extensive evaluation on a large dataset under a blind testing protocol, which demonstrates how different feature extraction methods and classifiers can be combined to obtain a robust and reliable system. A comparison with stand-alone approaches quantifies the performance improvement obtained by the ensembles proposed in this work.
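The idea of fusing classifiers built on different peptide encodings can be illustrated with a minimal sketch. The abstract does not name the four encodings, the classifiers, or the fusion rule, so everything below is an assumption chosen for illustration: an orthogonal (one-hot) encoding, an amino acid composition encoding, a toy nearest-centroid scorer (`CentroidScorer`, a hypothetical helper), and a mean-of-scores fusion (`fuse_scores`). The eight-residue peptides are invented toy strings, not real cleavage data.

```python
from typing import Callable, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

Encoder = Callable[[str], List[float]]


def orthogonal_encoding(peptide: str) -> List[float]:
    """One-hot encode every residue: 20 values per position."""
    vec: List[float] = []
    for residue in peptide:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def composition_encoding(peptide: str) -> List[float]:
    """Fraction of each amino acid in the peptide, ignoring position."""
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def _centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


class CentroidScorer:
    """Toy stand-in for a real classifier (assumption, not the paper's method).

    The score is positive when the encoded peptide lies closer to the
    centroid of the positive (cleaved) class than to the negative one.
    """

    def __init__(self, encode: Encoder) -> None:
        self.encode = encode
        self.pos: List[float] = []
        self.neg: List[float] = []

    def fit(self, peptides: Sequence[str], labels: Sequence[int]) -> "CentroidScorer":
        encoded = [self.encode(p) for p in peptides]
        self.pos = _centroid([v for v, y in zip(encoded, labels) if y == 1])
        self.neg = _centroid([v for v, y in zip(encoded, labels) if y == 0])
        return self

    def score(self, peptide: str) -> float:
        v = self.encode(peptide)
        return _sq_dist(v, self.neg) - _sq_dist(v, self.pos)


def fuse_scores(scorers: Sequence[CentroidScorer], peptide: str) -> float:
    """Fuse by averaging the raw scores of all ensemble members."""
    return sum(s.score(peptide) for s in scorers) / len(scorers)


# Invented toy training set: label 1 = "cleaved", 0 = "not cleaved".
toy_peptides = ["AAAAFFAA", "AAALFFAA", "GGGGGGGG", "GGGSGGGG"]
toy_labels = [1, 1, 0, 0]

ensemble = [
    CentroidScorer(orthogonal_encoding).fit(toy_peptides, toy_labels),
    CentroidScorer(composition_encoding).fit(toy_peptides, toy_labels),
]

# A fused score above zero is read as a predicted cleavage site.
predicted_cleaved = fuse_scores(ensemble, "AAAAFFAA") > 0
```

In this sketch the raw scores are averaged without normalization, so the encoding with the larger score range dominates the fused result; a practical system would likely rescale each member's scores before combining them.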