Zhang Jian, Zhao Xiaowei, Sun Pingping, Ma Zhiqiang
School of Computer Science and Information Technology, Northeast Normal University, Changchun 130017, China.
Int J Mol Sci. 2014 Jun 25;15(7):11204-19. doi: 10.3390/ijms150711204.
S-nitrosylation (SNO) is one of the most widespread reversible post-translational modifications and is involved in many biological processes. Malfunction or dysregulation of SNO leads to a range of severe diseases, including developmental abnormalities. Therefore, the identification of SNO sites (SNOs) provides insights into disease progression and drug development. In this paper, a new bioinformatics tool, named PSNO, is proposed to identify SNOs from protein sequences. First, we explore several promising sequence-derived discriminative features, including the evolutionary profile, the predicted secondary structure and the physicochemical properties. Second, rather than simply combining the features, which may introduce information redundancy and unwanted noise, we use relative entropy selection together with incremental feature selection to choose the optimal feature subset. Third, we train our model with the k-nearest neighbor algorithm. Using both informative features and an elaborate feature selection scheme, PSNO achieves good prediction performance, with a mean Matthews correlation coefficient (MCC) of about 0.5119 on the training dataset under 10-fold cross-validation. These results indicate that PSNO is competitive with state-of-the-art SNO site prediction tools. A web server implementing the proposed method is freely available at http://59.73.198.144:8088/PSNO/.
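The three steps named in the abstract (relative entropy ranking, incremental feature selection, and a k-nearest neighbor classifier scored by MCC under 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the PSNO implementation. The toy data, the histogram-based relative entropy estimate, the bin count, k = 3, and all function names are assumptions made for illustration. The sketch also pools predictions across folds into a single MCC, whereas the abstract reports a mean MCC.

```python
import math
import random


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def relative_entropy(pos, neg, bins=5, eps=1e-6):
    """KL divergence between histograms of one feature in the two classes.
    The histogram estimate and bin count are assumptions of this sketch."""
    lo, hi = min(pos + neg), max(pos + neg)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def hist(values):
        counts = [eps] * bins  # small pseudo-count avoids log(0)
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        total = sum(counts)
        return [c / total for c in counts]

    p, q = hist(pos), hist(neg)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_mcc(X, y, feats, folds=10, k=3):
    """MCC pooled over all folds, using only the columns listed in feats."""
    tp = tn = fp = fn = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        tr_X = [[X[i][j] for j in feats] for i in train]
        tr_y = [y[i] for i in train]
        for i in range(f, len(X), folds):  # held-out samples of fold f
            pred = knn_predict(tr_X, tr_y, [X[i][j] for j in feats], k)
            if pred == 1 and y[i] == 1:
                tp += 1
            elif pred == 0 and y[i] == 0:
                tn += 1
            elif pred == 1:
                fp += 1
            else:
                fn += 1
    return mcc(tp, tn, fp, fn)


def incremental_feature_selection(X, y, folds=10, k=3):
    """Rank features by relative entropy, then grow the subset one feature
    at a time and keep the prefix with the best cross-validated MCC."""
    n_feats = len(X[0])
    scores = []
    for j in range(n_feats):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(relative_entropy(pos, neg))
    ranked = sorted(range(n_feats), key=lambda j: scores[j], reverse=True)
    best_subset, best_score = [], -1.0
    for n in range(1, n_feats + 1):
        score = cv_mcc(X, y, ranked[:n], folds, k)
        if score > best_score:
            best_subset, best_score = ranked[:n], score
    return best_subset, best_score


# Toy data (hypothetical): feature 0 separates the classes, features 1-3 are noise.
rng = random.Random(0)
y = [1 if i < 30 else 0 for i in range(60)]
X = [[4 * y[i] + 0.1 * (i % 5)] + [rng.random() for _ in range(3)]
     for i in range(60)]

best_subset, best_score = incremental_feature_selection(X, y)
print(best_subset, best_score)
```

On this toy data the single informative feature already separates the two classes, so the incremental search stops improving after the first feature. Real sequence-derived features would need to be encoded per candidate site before a procedure like this could be applied.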