CPL-Diff: A Diffusion Model for De Novo Design of Functional Peptide Sequences with Fixed Length.

Author information

Luo Zhenjie, Geng Aoyun, Wei Leyi, Zou Quan, Cui Feifei, Zhang Zilong

Affiliations

College of Computer Science and Technology, Hainan University, No. 58, Renmin Avenue, Haikou, 570228, China.

Centre for Artificial Intelligence driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, Macao SAR, 999078, China.

Publication information

Adv Sci (Weinh). 2025 May;12(20):e2412926. doi: 10.1002/advs.202412926. Epub 2025 Apr 15.

Abstract

Peptides are regarded as next-generation therapeutics because of their unique properties, and they are essential for treating human diseases. In recent years, a number of deep generative models for peptide generation have been proposed and have shown great potential. However, these models cannot reliably control the length of the generated sequence, even though sequence length strongly affects the physicochemical properties and therapeutic effects of peptides. Here, CPL-Diff is introduced, a diffusion model capable of controlling the length of generated functional peptide sequences. CPL-Diff controls the length of generated polypeptide sequences using only attention masking. Additionally, it can generate single-functional polypeptide sequences based on given conditional information. Experiments demonstrate that peptides generated by CPL-Diff exhibit lower perplexity and similarity than those produced by current state-of-the-art models, and that their relevant physicochemical properties are similar to those of real sequences. An interpretability analysis is also performed on CPL-Diff to understand how it controls the length of generated sequences and the decision-making process involved in generating polypeptide sequences, with the aim of providing important theoretical guidance for polypeptide design. The code for CPL-Diff is available at https://github.com/luozhenjie1997/CPL-Diff.
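The abstract states only that length is controlled through attention masking and gives no implementation details. As a rough illustration of the general idea, and not the authors' code, the sketch below builds a padding-style mask for a target length inside a fixed maximum length and applies it to one row of attention scores, so that positions beyond the target length receive zero weight. The function names `length_mask` and `masked_softmax`, the maximum length of 8, and the score values are all hypothetical.

```python
import math


def length_mask(target_len, max_len):
    """Return 1 for positions the sequence may occupy and 0 for padding.

    Hypothetical helper: the real CPL-Diff masking scheme may differ.
    """
    if not 0 < target_len <= max_len:
        raise ValueError("target_len must be in 1..max_len")
    return [1] * target_len + [0] * (max_len - target_len)


def masked_softmax(scores, mask):
    """Softmax over one row of scores; masked-out positions get weight 0."""
    kept = [s for s, m in zip(scores, mask) if m]
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative values only: request a length-5 sequence in an 8-slot window.
mask = length_mask(target_len=5, max_len=8)
scores = [0.3, -1.2, 0.8, 0.1, 2.0, 0.5, 0.7, -0.4]
weights = masked_softmax(scores, mask)
```

In this sketch the last three positions never contribute to attention, which is one common way a model can be restricted to a chosen sequence length; how CPL-Diff combines the mask with its diffusion process is described in the paper and repository, not here.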
