Department of Computer Science, Zhejiang Normal University, Zhejiang, China.
Protein J. 2010 Apr;29(3):195-203. doi: 10.1007/s10930-010-9240-x.
One of the main challenges in biological applications is the accurate, automatic prediction of protein subcellular localization. To this end, a wide variety of machine learning methods have been proposed in recent years. Most of them focus on finding the optimal classification scheme, and fewer take into account reducing the complexity of the biological system. Traditionally, such bio-data are analyzed by performing feature selection before classification. Motivated by compressive sensing (CS), we propose a method that performs locality preserving projection with a sparseness criterion, so that feature selection and dimension reduction are merged into a single analysis. The proposed sparse method decreases the complexity of the biological system while increasing the accuracy of protein subcellular localization prediction. Experimental results are encouraging, indicating that the sparse method is promising for complicated biological problems such as predicting the subcellular localization of Gram-negative bacterial proteins.
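The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a locality preserving projection can be combined with a sparseness criterion. It assumes a two-step, spectral-regression-style scheme (a graph embedding followed by an L1-penalized regression onto the features), which is a stand-in and not necessarily the authors' formulation. All function names and parameters (`knn_heat_graph`, `graph_embedding`, `lasso_ista`, `sparse_lpp`, `k`, `t`, `alpha`) are hypothetical, and the sketch assumes the neighbourhood graph is connected.

```python
import numpy as np

def knn_heat_graph(X, k=5, t=None):
    """Symmetric k-nearest-neighbour graph with heat-kernel weights."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if t is None:
        t = np.median(d2)  # assumed bandwidth heuristic
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        nbrs = order[order != i][:k]  # k closest points, excluding i itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    return np.maximum(W, W.T)

def graph_embedding(W, dim):
    """Solve L y = lambda D y via the symmetric normalized Laplacian."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(deg)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and keep the next `dim`.
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]

def lasso_ista(X, y, alpha, n_iter=500):
    """Minimize (1/2n)||Xa - y||^2 + alpha*||a||_1 by proximal gradient."""
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = a - step * (X.T @ (X @ a - y)) / n
        a = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)  # soft threshold
    return a

def sparse_lpp(X, dim=2, k=5, t=None, alpha=0.05):
    """Return a (n_features, dim) projection matrix with sparse columns."""
    Xc = X - X.mean(axis=0)
    Y = graph_embedding(knn_heat_graph(Xc, k, t), dim)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)  # standardize the targets
    return np.column_stack(
        [lasso_ista(Xc, Y[:, j], alpha) for j in range(dim)]
    )
```

Under these assumptions, rows of the returned matrix that are entirely zero correspond to features dropped from the projection, which is the sense in which feature selection and dimension reduction happen in one analysis; a larger `alpha` zeroes more rows. Projected data would be obtained as `(X - X.mean(axis=0)) @ A`.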