Guan Jiahui, Xie Peilin, Meng Dian, Yao Lantian, Yu Dan, Chiang Ying-Chih, Lee Tzong-Yi, Wang Junwen
Division of Applied Oral Sciences and Community Dental Care, Faculty of Dentistry, The University of Hong Kong, Hong Kong, China.
Kobilka Institute of Innovative Drug Discovery, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Road, 518172, Shenzhen, China.
Comput Struct Biotechnol J. 2025 May 28;27:2347-2358. doi: 10.1016/j.csbj.2025.05.039. eCollection 2025.
Peptide-based therapeutics have emerged as a promising avenue in drug development, offering high biocompatibility, specificity, and efficacy. However, the potential toxicity of peptides remains a significant challenge and calls for robust toxicity prediction methods. In this study, we introduce ToxiPep, a novel dual-model framework for peptide toxicity prediction that integrates sequence-based contextual information with atomic-level structural features. The framework combines a bidirectional gated recurrent unit (BiGRU) with a Transformer to capture local and global sequence dependencies, and uses multi-scale convolutional neural networks (CNNs) to extract refined structural features from molecular graphs derived from peptide SMILES representations. A cross-attention mechanism aligns and fuses the two feature modalities, enabling the model to capture intricate relationships between sequence and structural information. ToxiPep outperforms several state-of-the-art tools, including ToxinPred2, CSM-Toxin, PepNet, and ToxinPred3, on both internal and independent test sets. In addition, interpretability analyses show that ToxiPep identifies key amino acids together with their structural features, providing insights into the molecular mechanisms of peptide toxicity. We have also developed a web server to make the tool broadly accessible. Overall, this framework has the potential to accelerate the identification of safer therapeutic peptides, offering new opportunities for peptide-based drug development in precision medicine.
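The cross-attention fusion step described in the abstract can be illustrated with a minimal sketch. This is not the ToxiPep implementation: it assumes plain scaled dot-product attention with no learned projections, in which residue-level sequence embeddings act as queries and atom-level structural embeddings act as keys and values. The function names, the toy feature vectors, and the embedding size are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (assumption: no learned projections).

    queries: sequence-side embeddings, one vector per residue
    keys, values: structure-side embeddings, one vector per atom
    Returns the fused vectors (one per query) and the attention weights.
    """
    dim = len(keys[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this residue embedding to every atom embedding.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Attention-weighted average of the atom-level value vectors.
        fused.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return fused, weights


# Toy inputs (made up): 3 residue embeddings and 4 atom embeddings, dimension 2.
seq_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
atom_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.0]]

fused, attn = cross_attention(seq_feats, atom_feats, atom_feats)
```

Each row of `attn` shows how strongly one residue attends to each atom. In a trained model, weights of this kind are one possible starting point for the interpretability analyses mentioned above, although the paper's own analysis may use a different procedure.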