Graduate School of Integrated Energy-AI, Jeonbuk National University, Jeonju, 54896, South Korea.
Department of Electrical Engineering, The University of Azad Jammu and Kashmir, Muzaffarabad, 13100, Pakistan; Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, South Korea.
Comput Biol Med. 2024 Dec;183:109200. doi: 10.1016/j.compbiomed.2024.109200. Epub 2024 Oct 3.
Protein nitrotyrosine is an essential post-translational modification that results from the nitration of tyrosine residues. This modification is associated with the regulation and characterization of several biological functions and diseases, so accurate identification of nitrotyrosine sites plays a significant role in elucidating the associated biological signs. In this regard, we report iNTyro-Stack, a computational tool for identifying protein nitrotyrosine sites. iNTyro-Stack is a machine-learning model based on a stacking algorithm, in which the base classifiers are those that achieved the highest performance. The feature map is a linear combination of amino acid composition encoding schemes, including the composition of k-spaced amino acid pairs and tripeptide composition. Recursive feature elimination is used to select the significant features. The performance of the proposed method is evaluated using k-fold cross-validation and independent testing. iNTyro-Stack achieved an accuracy of 86.3% and a Matthews correlation coefficient (MCC) of 72.6% in cross-validation. Its generalization capability was further validated on an imbalanced independent test set, where it attained an accuracy of 69.32%. iNTyro-Stack outperforms existing state-of-the-art methods under both evaluation techniques. A GitHub repository has been created to reproduce the method and results of iNTyro-Stack, accessible at: https://github.com/waleed551/iNTyro-Stack/.
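The two encoding schemes and the MCC metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default `k_max=2`, the handling of non-standard residues, and the plain concatenation of the two feature vectors are all assumptions made for illustration. Recursive feature elimination and the stacking classifier are not shown.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 frequencies per gap value. Pairs containing a
    non-standard character are skipped (an assumption of this sketch).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]  # residues separated by k positions
            if pair in counts:
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


def tripeptide_composition(seq):
    """Frequency of each of the 8000 possible contiguous tripeptides."""
    tripeptides = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
    counts = dict.fromkeys(tripeptides, 0)
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    return [counts[t] / total if total else 0.0 for t in tripeptides]


def encode(seq, k_max=2):
    """Concatenate both encodings into one feature vector (assumed layout)."""
    return cksaap(seq, k_max) + tripeptide_composition(seq)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

With these assumed defaults, `encode(window)` yields 3 × 400 + 8000 = 9200 features for any sequence window centered on a candidate tyrosine. A vector of this size is sparse for short windows, which is one reason a feature selection step such as recursive feature elimination is useful before training.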