Hasan Md Mehedi, Alam Md Ashad, Shoombuatong Watshara, Kurata Hiroyuki
Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan.
Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan.
J Comput Aided Mol Des. 2021 Mar;35(3):315-323. doi: 10.1007/s10822-020-00368-0. Epub 2021 Jan 4.
Redox-sensitive cysteine (RSC) thiol contributes to many biological processes. Identifying RSCs is important for clarifying the mechanisms of redox-sensitive factors; however, experimental investigation of RSCs is expensive and time-consuming. Computational approaches that quickly and accurately identify candidate RSCs from sequence information are urgently needed. Herein, an improved and robust computational predictor named IRC-Fuse was developed to identify RSCs by fusing multiple feature representations. To enhance the performance of the model, we integrated the probability scores produced by random forest models that each implement a different encoding scheme. Cross-validation results showed that IRC-Fuse achieved an accuracy of 0.741 and an AUC of 0.807. On independent test data, IRC-Fuse outperformed existing methods, with improvements of 10% in accuracy and 13% in Matthews correlation coefficient (MCC). Comparative analysis suggested that IRC-Fuse is more effective and promising than the existing predictors. For the convenience of experimental scientists, the IRC-Fuse online web server has been implemented and is publicly accessible at http://kurata14.bio.kyutech.ac.jp/IRC-Fuse/ .
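The score-fusion idea and the two reported metrics can be illustrated with a minimal sketch. The abstract does not state how the per-encoding probability scores are combined, so the weighted mean below, the 0.5 decision threshold, the function names (fuse_scores, accuracy, mcc), and the toy scores are all assumptions for illustration, not the IRC-Fuse implementation.

```python
import math

def fuse_scores(score_lists, weights=None):
    """Combine per-model probability scores into one score per sample.

    Assumption: a weighted mean; the actual IRC-Fuse fusion rule is not
    given in the abstract.
    """
    n_models = len(score_lists)
    if weights is None:
        weights = [1.0] * n_models  # equal weighting by default
    n_samples = len(score_lists[0])
    if any(len(s) != n_samples for s in score_lists):
        raise ValueError("all models must score the same samples")
    total = sum(weights)
    return [
        sum(w * s[i] for w, s in zip(weights, score_lists)) / total
        for i in range(n_samples)
    ]

def confusion(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / len(y_true)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Toy scores from two hypothetical encoding-specific models (made-up values).
scores_encoding_a = [0.9, 0.2, 0.6, 0.4]
scores_encoding_b = [0.7, 0.4, 0.3, 0.1]
y_true = [1, 0, 1, 0]  # 1 = redox-sensitive cysteine, 0 = not

fused = fuse_scores([scores_encoding_a, scores_encoding_b])
y_pred = [1 if s >= 0.5 else 0 for s in fused]  # assumed 0.5 threshold
print(accuracy(y_true, y_pred), mcc(y_true, y_pred))
```

In a fuller pipeline, each score list would come from a separately trained classifier (random forests in the paper), one per sequence encoding, with the fused score thresholded to call a cysteine redox-sensitive.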