PEL-PVP: Application of plant vacuolar protein discriminator based on PEFT ESM-2 and bilayer LSTM in an unbalanced dataset.

Affiliations

School of Computer Science and Technology, Hainan University, Haikou 570228, China.

Publication information

Int J Biol Macromol. 2024 Oct;277(Pt 3):134317. doi: 10.1016/j.ijbiomac.2024.134317. Epub 2024 Jul 31.

Abstract

Plant vacuoles play a crucial role in maintaining cellular stability, adapting to environmental changes, and responding to external stresses. Accurate identification of plant vacuolar proteins (PVPs) is crucial for understanding the biosynthetic mechanisms of intracellular vacuoles and the adaptive mechanisms of plants. To identify vacuolar proteins more accurately, this study developed a new predictive model, PEL-PVP, based on ESM-2. The study demonstrates the feasibility and effectiveness of applying advanced pre-trained models and fine-tuning techniques to bioinformatics tasks, providing new methods and ideas for plant vacuolar protein research. In addition, previous vacuolar protein datasets were balanced, whereas imbalanced data more closely reflect real-world conditions. This study therefore constructed an imbalanced dataset, UB-PVP, from the UniProt database, which helps the model adapt to the complexity and uncertainty of real environments and thereby improves its generalization ability and practicality. Experimental results show that, compared with existing identification methods, PEL-PVP achieves significant improvements on multiple metrics, with gains of 6.08%, 13.51%, 11.9%, and 5% in ACC, SP, MCC, and AUC, respectively. Its accuracy reaches 94.59%, significantly higher than that of the previous best model, GraphIdn. PEL-PVP thus provides an efficient and precise tool for the study of plant vacuolar proteins.
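The abstract reports ACC, SP, and MCC on an imbalanced dataset. As a minimal sketch of why several metrics are reported together, the snippet below computes them from confusion-matrix counts using the standard textbook definitions; that the paper uses exactly these definitions is an assumption, and the function name `binary_metrics` and all counts are hypothetical illustrations, not data from the study.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return ACC, SN, SP, and MCC from confusion-matrix counts.

    Standard definitions; assumed, not taken from the paper.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical imbalanced set: 10 positives, 90 negatives.
# A classifier that labels everything negative still scores high accuracy,
# but its sensitivity and MCC are both zero.
all_negative = binary_metrics(tp=0, fp=0, tn=90, fn=10)

# A hypothetical classifier that actually finds most positives.
useful = binary_metrics(tp=8, fp=5, tn=85, fn=2)

for name, scores in [("all-negative", all_negative), ("useful", useful)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

On imbalanced data, accuracy alone can look strong for a model that never predicts the minority class, which is one reason MCC and specificity are commonly reported alongside it.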

