Raghuraj Rao, Lakshminarayanan Samavedham
Department of Chemical and Biomolecular Engineering, 4 Engineering Drive 4, National University of Singapore, Singapore.
FEBS Lett. 2007 Mar 6;581(5):826-30. doi: 10.1016/j.febslet.2007.01.052. Epub 2007 Feb 2.
Data classification algorithms used for class prediction in the computational biology literature are data-specific and have shown varying degrees of performance. Different classes cannot be distinguished solely on the basis of interclass distances or decision boundaries. We propose that inter-relations among the features be exploited to separate observations into specific classes. A new variable predictive model based class discrimination (VPMCD) method is described here. Three well-established data sets of varying statistical and biological significance are used as benchmarks. The performance of the new method is compared with that of advanced classification algorithms. The new method performs better across the different tests and shows higher stability and robustness. VPMCD appears to be a potentially strong classification approach that can be extended to other data mining applications involving biological systems.
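The idea of separating classes by inter-relations among features, rather than by distances or decision boundaries, can be illustrated with a minimal sketch. The abstract does not specify the model form or the decision rule, so everything below is an assumption made for illustration: each feature is predicted from the remaining features with an ordinary least-squares linear model fitted per class, and a new observation is assigned to the class whose models reproduce its features with the smallest summed squared error. The function names (`fit_class_models`, `prediction_error`, `fit`, `predict`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_class_models(X):
    """For one class, fit a linear model that predicts each feature from the others.

    Assumption: plain least squares with an intercept; the published method
    may use other model forms.
    """
    n, p = X.shape
    models = []
    for j in range(p):
        others = np.delete(X, j, axis=1)           # all features except j
        A = np.column_stack([np.ones(n), others])  # add intercept column
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        models.append(coef)
    return models


def prediction_error(models, x):
    """Sum of squared errors when each feature of x is predicted from the rest."""
    err = 0.0
    for j, coef in enumerate(models):
        others = np.delete(x, j)
        pred = coef[0] + others @ coef[1:]
        err += (x[j] - pred) ** 2
    return err


def fit(X, y):
    """Fit one set of per-feature models for every class label in y."""
    return {label: fit_class_models(X[y == label]) for label in np.unique(y)}


def predict(class_models, x):
    """Assign x to the class whose feature models reproduce it best."""
    return min(class_models, key=lambda label: prediction_error(class_models[label], x))


# Synthetic illustration (hypothetical data): both classes are centred at the
# origin, so they overlap in space, but their features are related differently.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
X0 = np.column_stack([a, 2 * a + 0.05 * rng.normal(size=50)])   # class 0: x2 ~ 2*x1
X1 = np.column_stack([b, -2 * b + 0.05 * rng.normal(size=50)])  # class 1: x2 ~ -2*x1
X = np.vstack([X0, X1])
y = np.array([0] * 50 + [1] * 50)

class_models = fit(X, y)
print(predict(class_models, np.array([1.0, 2.0])))
print(predict(class_models, np.array([1.0, -2.0])))
```

In this toy setting the two classes share the same centre, so a rule based only on distance to class means would struggle, while the feature relationships (x2 ≈ 2·x1 versus x2 ≈ −2·x1) still separate them.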