

Prediction of Protein Ion-Ligand Binding Sites with ELECTRA.

Affiliation

Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Publication information

Molecules. 2023 Sep 25;28(19):6793. doi: 10.3390/molecules28196793.

Abstract

Interactions between proteins and ions are essential for various biological functions like structural stability, metabolism, and signal transport. Given that more than half of all proteins bind to ions, it is becoming crucial to identify ion-binding sites. The accurate identification of protein-ion binding sites helps us to understand proteins' biological functions and plays a significant role in drug discovery. While several computational approaches have been proposed, this remains a challenging problem due to the small size and high versatility of metals and acid radicals. In this study, we propose IonPred, a sequence-based approach that employs ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) to predict ion-binding sites using only raw protein sequences. We successfully fine-tuned our pretrained model to predict the binding sites for nine metal ions (Zn²⁺, Cu²⁺, Fe²⁺, Fe³⁺, Ca²⁺, Mg²⁺, Mn²⁺, Na⁺, and K⁺) and four acid radical ion ligands (CO₃²⁻, SO₄²⁻, PO₄³⁻, NO₂⁻). IonPred surpassed six current state-of-the-art tools by over 44.65% and 28.46%, respectively, in the F1 score and MCC when compared on an independent test dataset. Our method is more computationally efficient than existing tools, producing prediction results for a hundred sequences for a specific ion in under ten minutes.
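The abstract reports its comparison in terms of the F1 score and the Matthews correlation coefficient (MCC). As a reminder of how these two metrics are computed for per-residue binary labels (1 = binding residue, 0 = non-binding), here is a minimal sketch using the standard textbook definitions. The function names and the toy label vectors are hypothetical and are not taken from the paper or its dataset.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = binding site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)); 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example: hypothetical labels for an 8-residue sequence (not real data).
y_true = [0, 1, 1, 0, 0, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"F1  = {f1_score(tp, fp, fn):.4f}")
print(f"MCC = {mcc(tp, fp, tn, fn):.4f}")
```

Binding residues are typically a small minority of all residues in a protein, which is one reason MCC is often reported alongside F1: it also accounts for true negatives and stays informative under class imbalance.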


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0553/10574437/2d217d13230b/molecules-28-06793-g001.jpg
