Department of Information and Computer Science, University of Science and Technology Beijing, Beijing, China.
PLoS One. 2013;8(2):e55844. doi: 10.1371/journal.pone.0055844. Epub 2013 Feb 7.
Posttranslational modifications (PTMs) of proteins sense and transduce signals that regulate a wide range of cellular functions and signaling events. S-nitrosylation (SNO) is one of the most important and widespread PTMs. With the avalanche of protein sequences generated in the post-genomic age, computational methods that can promptly identify the exact SNO sites in proteins are highly desirable, because this information is valuable for both basic research and drug development. Here, a new predictor called iSNO-PseAAC was developed to identify SNO sites in proteins by incorporating the position-specific amino acid propensity (PSAAP) into the general form of pseudo amino acid composition (PseAAC). The predictor was implemented using the conditional random field (CRF) algorithm. As a demonstration, a benchmark dataset containing 731 SNO sites and 810 non-SNO sites was constructed. To reduce homology bias, none of these sites was derived from proteins that had [Formula: see text] pairwise sequence identity to any other. The overall cross-validation success rate achieved by iSNO-PseAAC in identifying nitrosylated proteins on an independent dataset was over 90%, indicating that the new predictor is quite promising. Furthermore, a user-friendly web server for iSNO-PseAAC was established at http://app.aporc.org/iSNO-PseAAC/, through which users can obtain the desired results without having to follow the mathematical equations involved in developing the prediction method. It is anticipated that iSNO-PseAAC may become a useful high-throughput tool for identifying SNO sites, or at the very least complement the existing methods in this area.
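To make the idea of a position-specific amino acid propensity more concrete, the following Python sketch builds a propensity matrix from fixed-length peptide windows and uses it to encode a peptide as a feature vector. It rests on assumptions that the abstract does not confirm: the propensity is taken to be the difference between the per-position residue frequencies in SNO and non-SNO windows, and the function names, window length, and toy peptides are hypothetical. It does not reproduce the published feature definition, and the CRF classifier is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def position_frequencies(peptides):
    """Return one dict per window position mapping residue -> relative frequency."""
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        freqs.append({aa: counts.get(aa, 0) / len(peptides) for aa in AMINO_ACIDS})
    return freqs


def propensity_matrix(positive, negative):
    """Assumed PSAAP: frequency in SNO windows minus frequency in non-SNO windows."""
    pos = position_frequencies(positive)
    neg = position_frequencies(negative)
    if len(pos) != len(neg):
        raise ValueError("positive and negative windows must have the same length")
    return [{aa: p[aa] - n[aa] for aa in AMINO_ACIDS} for p, n in zip(pos, neg)]


def encode(peptide, matrix):
    """Map a peptide to one propensity value per position (0.0 for unknown residues)."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix width")
    return [column.get(aa, 0.0) for aa, column in zip(peptide, matrix)]


# Toy windows (hypothetical data, cysteine in the center position).
sno_windows = ["ACD", "ACE"]
non_sno_windows = ["GCD", "ACD"]

matrix = propensity_matrix(sno_windows, non_sno_windows)
features = encode("ACE", matrix)
print(features)  # [0.5, 0.0, 0.5]
```

In this toy case the central cysteine contributes 0.0 because it occurs in every window of both classes, so only the flanking positions carry signal. Vectors like `features` are the kind of input a sequence classifier could consume.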