Liao Sisheng, Xu Gang, Jin Li, Ma Jianpeng
School of Life Sciences, Fudan University, Shanghai 200433, China.
Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200433, China.
Molecules. 2025 Feb 28;30(5):1116. doi: 10.3390/molecules30051116.
This study presents PolypeptideDesigner (PPD), a novel conditional diffusion-based model for de novo polypeptide sequence design and generation conditioned on per-residue secondary structure. By integrating a lightweight LSTM-attention neural network as the denoiser within a diffusion framework, PPD offers an efficient approach to polypeptide generation. Evaluations demonstrate that PPD can generate diverse and novel polypeptide sequences across various testing conditions, and that these sequences achieve high pLDDT scores when folded by ESMFold. Compared with the ProteinDiffusionGenerator B (PDG-B) model, a relevant benchmark in the field, PPD produces longer and more diverse polypeptide sequences. This improvement is attributed to PPD's optimized architecture and expanded training dataset, which enhance its understanding of protein structural patterns. The PPD model shows significant potential for optimizing functional polypeptides with known structures, paving the way for advancements in biomaterial design. Future work will focus on further refining the model and exploring its broader applications in polypeptide engineering.