Jiao Shihu, Ye Xiucai, Sakurai Tetsuya, Zou Quan, Han Wu, Zhan Chao
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.
Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan.
BMC Biol. 2025 Jul 28;23(1):229. doi: 10.1186/s12915-025-02329-1.
Peptide-based therapeutics have great potential due to their versatility, high specificity, and suitability for a variety of therapeutic applications. Despite these advantages, the inherent toxicities of some peptides pose challenges in drug development. Several computational methods have been developed to allow rapid and efficient large-scale screening of peptide toxicity. However, these methods mainly rely on the primary sequence and often ignore critical structural information, which limits their predictive accuracy.
In this study, we introduce StrucToxNet, a novel framework that integrates a pre-trained protein language model with an equivariant graph neural network to improve peptide toxicity prediction. By combining sequence embeddings from the ProtT5 language model with 3D structures predicted by ESMFold, StrucToxNet captures both the sequential and the spatial characteristics of peptides. Testing on an independent dataset indicates that StrucToxNet outperforms existing sequence-based models across multiple metrics, achieving higher balanced accuracy and better overall performance.
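To make the idea of pairing per-residue embeddings with 3D coordinates concrete, the following is a minimal NumPy sketch of one EGNN-style message-passing layer over a residue graph. It is an illustration under stated assumptions, not the StrucToxNet implementation: random arrays stand in for ProtT5 embeddings and ESMFold-predicted C-alpha coordinates, and the 10 Å distance cutoff, the hidden size, the mean-pooled logistic readout, and all function names (`build_edges`, `init_params`, `egnn_layer`, `predict_toxicity`) are hypothetical choices made for this example.

```python
import numpy as np


def build_edges(coords, cutoff=10.0):
    """Connect residue pairs whose coordinates are closer than `cutoff`.

    The 10 angstrom cutoff is an assumption for this sketch.
    """
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    mask = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    src, dst = np.nonzero(mask)
    return src, dst


def init_params(rng, d, m, scale=0.01):
    """Random weights for one layer (hypothetical sizes, untrained)."""
    w_e = rng.normal(size=(2 * d + 1, m)) * scale  # edge message weights
    w_x = rng.normal(size=(m, 1)) * scale          # coordinate gate weights
    w_h = rng.normal(size=(d + m, d)) * scale      # node update weights
    return w_e, w_x, w_h


def egnn_layer(h, x, src, dst, params):
    """One EGNN-style update of node features h (N, D) and coordinates x (N, 3)."""
    w_e, w_x, w_h = params
    rel = x[src] - x[dst]                              # relative position vectors
    d2 = np.sum(rel ** 2, axis=1, keepdims=True)       # squared distances (invariant)
    msg = np.tanh(np.concatenate([h[src], h[dst], d2], axis=1) @ w_e)

    # Coordinates move along relative vectors, scaled by an invariant gate.
    dx = np.zeros_like(x)
    np.add.at(dx, src, rel * np.tanh(msg @ w_x))

    # Features are updated from the sum of incoming messages (residual form).
    agg = np.zeros((h.shape[0], msg.shape[1]))
    np.add.at(agg, src, msg)
    h_new = h + np.tanh(np.concatenate([h, agg], axis=1) @ w_h)
    return h_new, x + dx


def predict_toxicity(h, w_out):
    """Mean-pool residue features and apply a logistic readout (assumed head)."""
    z = h.mean(axis=0) @ w_out
    return 1.0 / (1.0 + np.exp(-z))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_res, d, m = 12, 1024, 32                 # 1024 matches ProtT5-XL embedding width
    h = rng.normal(size=(n_res, d))            # stand-in for ProtT5 embeddings
    x = np.cumsum(rng.normal(size=(n_res, 3)) * 2.2, axis=0)  # stand-in for predicted coordinates
    params = init_params(rng, d, m)
    src, dst = build_edges(x)
    h_out, x_out = egnn_layer(h, x, src, dst, params)
    print(predict_toxicity(h_out, rng.normal(size=d) * 0.01))
```

Because messages depend on coordinates only through squared distances, and coordinate updates are sums of relative vectors, rotating or translating the input structure leaves the output features unchanged and transforms the output coordinates in the same way. That property is what "equivariant" refers to in this setting.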
The results demonstrate the robustness and generalizability of StrucToxNet, making it a reliable tool for the computational screening of toxic peptides and facilitating safer peptide-based drug development.