Prediction of Protein Ion-Ligand Binding Sites with ELECTRA

Affiliation

Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.

Publication information

Molecules. 2023 Sep 25;28(19):6793. doi: 10.3390/molecules28196793.

Abstract

Interactions between proteins and ions are essential for various biological functions such as structural stability, metabolism, and signal transport. Given that more than half of all proteins bind to ions, identifying ion-binding sites is becoming crucial. The accurate identification of protein-ion binding sites helps us to understand proteins' biological functions and plays a significant role in drug discovery. While several computational approaches have been proposed, this remains a challenging problem due to the small size and high versatility of metals and acid radicals. In this study, we propose IonPred, a sequence-based approach that employs ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) to predict ion-binding sites using only raw protein sequences. We successfully fine-tuned our pretrained model to predict the binding sites for nine metal ions (Zn²⁺, Cu²⁺, Fe²⁺, Fe³⁺, Ca²⁺, Mg²⁺, Mn²⁺, Na⁺, and K⁺) and four acid radical ion ligands (CO₃²⁻, SO₄²⁻, PO₄³⁻, NO₂⁻). IonPred surpassed six current state-of-the-art tools by over 44.65% and 28.46%, respectively, in the F1 score and MCC when compared on an independent test dataset. Our method is more computationally efficient than existing tools, producing prediction results for a hundred sequences for a specific ion in under ten minutes.
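The abstract reports results as F1 score and MCC (Matthews correlation coefficient) without defining them. As a minimal sketch, the standard definitions of both metrics can be computed from per-residue binary labels (1 = binding residue, 0 = non-binding). The function names and the toy labels below are illustrative assumptions, not taken from the paper or from IonPred's code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary per-residue labels (1 = binding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels for a 10-residue sequence (made up for illustration).
y_true = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"F1  = {f1_score(tp, fp, fn):.4f}")
print(f"MCC = {mcc(tp, fp, tn, fn):.4f}")
```

Binding residues are typically a small minority of a sequence, which is one reason MCC is commonly reported alongside F1: it also accounts for true negatives and stays informative under class imbalance.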

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0553/10574437/2d217d13230b/molecules-28-06793-g001.jpg
