Chen Tianlai, Dumas Madeleine, Watson Rio, Vincoff Sophia, Peng Christina, Zhao Lin, Hong Lauren, Pertsemlidis Sarah, Shaepers-Cheu Mayumi, Wang Tian Zi, Srijay Divya, Monticello Connor, Vure Pranay, Pulugurta Rishab, Kholina Kseniia, Goel Shrey, DeLisa Matthew P, Truant Ray, Aguilar Hector C, Chatterjee Pranam
Department of Biomedical Engineering, Duke University.
Department of Microbiology and Immunology, College of Veterinary Medicine, Cornell University.
ArXiv. 2024 Aug 11:arXiv:2310.03842v3.
Target proteins that lack accessible binding pockets and conformational stability have posed increasing challenges for drug development. Induced proximity strategies, such as PROTACs and molecular glues, have thus gained attention as pharmacological alternatives, but they still require small-molecule docking at binding pockets for targeted protein degradation. The computational design of protein-based binders presents unique opportunities to access "undruggable" targets, but it has often relied on stable 3D structures or structure-influenced latent spaces for effective binder generation. In this work, we introduce PepMLM, a target sequence-conditioned generator of linear peptide binders. By employing a novel span masking strategy that positions cognate peptide sequences at the C-terminus of target protein sequences, PepMLM fine-tunes the state-of-the-art ESM-2 protein language model (pLM) to fully reconstruct the binder region, achieving low perplexities that match or improve upon those of validated peptide-protein sequence pairs. After benchmarking with AlphaFold-Multimer, in which PepMLM outperforms RFDiffusion on structured targets, we experimentally verify PepMLM's efficacy by fusing model-derived peptides to E3 ubiquitin ligase domains, demonstrating endogenous degradation of emergent viral phosphoproteins and Huntington's disease-driving proteins. In total, PepMLM enables the generative design of candidate binders to any target protein, without the requirement of target structure, empowering downstream therapeutic applications.
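The span masking idea described in the abstract can be illustrated with a minimal sketch. It only shows how a training example might be laid out (target sequence followed by a fully masked peptide span) and how a perplexity could be computed from per-residue probabilities. The function names, the `<mask>` symbol, the example sequences, and the character-level tokenization are all assumptions made for illustration; this is not the PepMLM code or the ESM-2 tokenizer, and the paper's exact perplexity metric may differ.

```python
import math
from typing import List, Optional, Tuple

MASK = "<mask>"  # assumed placeholder; a real tokenizer defines its own mask token


def build_masked_example(
    target: str, peptide: str
) -> Tuple[List[str], List[Optional[str]]]:
    """Place the peptide after the target's C-terminus and mask the whole peptide span."""
    # Target residues stay visible; every peptide position is replaced by the mask symbol.
    tokens = list(target) + [MASK] * len(peptide)
    # Labels are None for target residues (not scored) and the true residue for masked positions.
    labels: List[Optional[str]] = [None] * len(target) + list(peptide)
    return tokens, labels


def perplexity(true_residue_probs: List[float]) -> float:
    """Standard perplexity: exp of the mean negative log-probability of the true residues."""
    nll = [-math.log(p) for p in true_residue_probs]
    return math.exp(sum(nll) / len(nll))


# Hypothetical toy sequences, chosen only to show the layout.
tokens, labels = build_masked_example("MKTAYIAK", "GSW")
```

In this layout a masked language model would be asked to predict every residue of the peptide span given only the target sequence, and lower perplexity over that span would indicate that the model assigns higher probability to the true binder residues.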