Bhat Suhaas, Palepu Kalyan, Hong Lauren, Mao Joey, Ye Tianzheng, Iyer Rema, Zhao Lin, Chen Tianlai, Vincoff Sophia, Watson Rio, Wang Tian Z, Srijay Divya, Kavirayuni Venkata Srikar, Kholina Kseniia, Goel Shrey, Vure Pranay, Deshpande Aniruddha J, Soderling Scott H, DeLisa Matthew P, Chatterjee Pranam
Department of Biomedical Engineering, Duke University, Durham, NC, USA.
Department of Cell Biology, Duke University, Durham, NC, USA.
Sci Adv. 2025 Jan 24;11(4):eadr8638. doi: 10.1126/sciadv.adr8638. Epub 2025 Jan 22.
Designing binders to target undruggable proteins presents a formidable challenge in drug discovery. In this work, we provide an algorithmic framework to design short, target-binding linear peptides, requiring only the amino acid sequence of the target protein. To do this, we generate naturalistic peptide candidates through Gaussian perturbation of the peptidic latent space of the ESM-2 protein language model, and then screen these novel sequences for target-selective interaction activity with an architecture based on contrastive language-image pretraining (CLIP). By integrating these generative and discriminative steps, we create a Peptide Prioritization via CLIP (PepPrCLIP) pipeline and experimentally validate highly ranked, target-specific peptides, both as inhibitory peptides and as fusions to E3 ubiquitin ligase domains. PepPrCLIP-derived constructs demonstrate functionally potent binding and degradation of conformationally diverse, disease-driving targets in vitro. In total, PepPrCLIP enables the modulation of previously inaccessible proteins without reliance on stable and ordered tertiary structures.
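The generate-then-screen idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vectors below stand in for ESM-2 peptide and target embeddings, plain cosine similarity stands in for the trained CLIP-style scoring model, and the step that decodes perturbed embeddings back into peptide sequences is omitted. The function names and the parameters `sigma`, `n_candidates`, and `top_k` are hypothetical and chosen only for illustration.

```python
import math
import random


def perturb(embedding, sigma, rng):
    """Add independent Gaussian noise (std = sigma) to each latent coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in embedding]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_score(peptide_vec, target_vec):
    """Cosine similarity, used here as a stand-in for a learned CLIP-style score."""
    p = l2_normalize(peptide_vec)
    t = l2_normalize(target_vec)
    return sum(a * b for a, b in zip(p, t))


def rank_candidates(seed_embedding, target_embedding,
                    n_candidates=100, sigma=0.1, top_k=5, seed=0):
    """Generate perturbed candidates around a seed embedding and keep the
    top_k by similarity to the target embedding (highest score first)."""
    rng = random.Random(seed)
    candidates = [perturb(seed_embedding, sigma, rng) for _ in range(n_candidates)]
    scored = [(cosine_score(c, target_embedding), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Random vectors play the role of real embeddings in this toy example.
    rng = random.Random(42)
    seed_embedding = [rng.gauss(0.0, 1.0) for _ in range(8)]
    target_embedding = [rng.gauss(0.0, 1.0) for _ in range(8)]
    for score, _ in rank_candidates(seed_embedding, target_embedding):
        print(f"{score:.4f}")
```

In a realistic setting, the seed embeddings would come from known peptides passed through a protein language model, and the scoring function would be a pair of trained encoders that map peptide and target into a shared space; the sketch only shows how the generative and discriminative steps compose.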