IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, USA.
Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK.
Sci Adv. 2023 Jun 23;9(25):eadg7865. doi: 10.1126/sciadv.adg7865. Epub 2023 Jun 21.
Inhibitor discovery for emerging drug-target proteins is challenging, especially when the target structure or active molecules are unknown. Here, we experimentally validate the broad utility of a deep generative framework that is trained at scale on protein sequences, small molecules, and their mutual interactions and is unbiased toward any specific target. We performed protein sequence-conditioned sampling on the generative foundation model to design small-molecule inhibitors for two dissimilar targets: the spike protein receptor-binding domain (RBD) and the main protease from SARS-CoV-2. Although only the target sequence information was used during model inference, micromolar-level inhibition was observed in vitro for two of the four candidates synthesized for each target. The most potent spike RBD inhibitor exhibited activity against several variants in live virus neutralization assays. These results establish that a single, broadly deployable generative foundation model for accelerated inhibitor discovery is effective and efficient, even in the absence of target structure or binder information.