SIMLIN：一种基于多阶段集成学习模型的人类蛋白质组中 S-亚磺酰化预测的生物信息学工具。

SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models.

机构信息

Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, 3800, Australia.

Division of Cancer Epidemiology, Cancer Council Victoria, Melbourne, VIC, 3004, Australia.

出版信息

BMC Bioinformatics. 2019 Nov 21;20(1):602. doi: 10.1186/s12859-019-3178-6.

DOI:10.1186/s12859-019-3178-6

PMID:31752668

原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC6868744/

Abstract

BACKGROUND

S-sulphenylation is a ubiquitous protein post-translational modification (PTM) where an S-hydroxyl (-SOH) bond is formed via the reversible oxidation on the Sulfhydryl group of cysteine (C). Recent experimental studies have revealed that S-sulphenylation plays critical roles in many biological functions, such as protein regulation and cell signaling. State-of-the-art bioinformatic advances have facilitated high-throughput in silico screening of protein S-sulphenylation sites, thereby significantly reducing the time and labour costs traditionally required for the experimental investigation of S-sulphenylation.

RESULTS

In this study, we have proposed a novel hybrid computational framework, termed SIMLIN, for accurate prediction of protein S-sulphenylation sites using a multi-stage neural-network based ensemble-learning model integrating both protein sequence derived and protein structural features. Benchmarking experiments against the current state-of-the-art predictors for S-sulphenylation demonstrated that SIMLIN delivered competitive prediction performance. The empirical studies on the independent testing dataset demonstrated that SIMLIN achieved 88.0% prediction accuracy and an AUC score of 0.82, which outperforms currently existing methods.

CONCLUSIONS

In summary, SIMLIN predicts human S-sulphenylation sites with high accuracy thereby facilitating biological hypothesis generation and experimental validation. The web server, datasets, and online instructions are freely available at http://simlin.erc.monash.edu/ for academic purposes.

摘要

背景

S-亚磺酰化是一种普遍存在的蛋白质翻译后修饰（PTM），其中通过半胱氨酸（C）的巯基的可逆氧化形成 S-羟基（-SOH）键。最近的实验研究表明，S-亚磺酰化在许多生物功能中起着关键作用，如蛋白质调节和细胞信号转导。最先进的生物信息学进展促进了蛋白质 S-亚磺酰化位点的高通量计算机筛选，从而大大减少了传统上对 S-亚磺酰化进行实验研究所需的时间和劳动力成本。

结果

在这项研究中，我们提出了一种新的混合计算框架，称为 SIMLIN，用于使用基于多阶段神经网络的集成学习模型准确预测蛋白质 S-亚磺酰化位点，该模型集成了蛋白质序列衍生和蛋白质结构特征。针对 S-亚磺酰化的最新预测器进行的基准实验表明，SIMLIN 提供了有竞争力的预测性能。在独立测试数据集上的实证研究表明，SIMLIN 实现了 88.0%的预测准确率和 0.82 的 AUC 得分，优于现有方法。

结论

总之，SIMLIN 可以高精度地预测人类 S-亚磺酰化位点，从而促进生物学假设的生成和实验验证。该网络服务器、数据集和在线说明可在学术目的免费使用：http://simlin.erc.monash.edu/。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

SIMLIN：一种基于多阶段集成学习模型的人类蛋白质组中 S-亚磺酰化预测的生物信息学工具。

SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models.

机构信息

出版信息

BACKGROUND

RESULTS

CONCLUSIONS

背景

结果

结论

相似文献

引用本文的文献

本文引用的文献

SIMLIN：一种基于多阶段集成学习模型的人类蛋白质组中 S-亚磺酰化预测的生物信息学工具。

SIMLIN: a bioinformatics tool for prediction of S-sulphenylation in the human proteome based on multi-stage ensemble-learning models.

机构信息

出版信息

BACKGROUND

RESULTS

CONCLUSIONS

背景

结果

结论

相似文献

引用本文的文献

本文引用的文献