Zhang Peidong, Peng Xingang, Han Rong, Chen Ting, Ma Jianzhu
Department of Computer Science and Technology, Tsinghua University, Haidian District, Beijing 100084, China.
Institute of Artificial Intelligence, Tsinghua University, Haidian District, Beijing 100084, China.
Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf265.
Artificial intelligence (AI) has brought tremendous progress to drug discovery, yet identifying hit and lead compounds with optimal physicochemical and pharmacological properties remains a significant challenge. Structure-based drug design (SBDD) has emerged as a promising paradigm, but inherent data biases and neglect of synthetic accessibility leave SBDD models disconnected from practical drug discovery. In this work, we explore two methodologies, Rag2Mol-G and Rag2Mol-R, both based on retrieval-augmented generation, to design small molecules that fit a 3D pocket. The methods either search a database for purchasable small molecules similar to the generated ones, or create new molecules from database compounds that can fit into the 3D pocket. Experimental results demonstrate that Rag2Mol methods consistently produce drug candidates with superior binding affinities and drug-likeness. We find that Rag2Mol-R provides broader coverage of the chemical landscape and more precise targeting capability than advanced virtual screening models. Notably, both workflows identified promising inhibitors for the challenging target PTPN2, a protein tyrosine phosphatase that was long considered undruggable and still lacks inhibitors that have completed full clinical trials. Our highly extensible framework can integrate diverse SBDD methods, marking a significant advancement in AI-driven SBDD.
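The retrieval step described in the abstract, finding purchasable molecules similar to a generated one, can be illustrated with a minimal sketch. The abstract does not state which similarity measure or fingerprint Rag2Mol uses, so the Tanimoto coefficient over sets of fingerprint bits, the function names `tanimoto` and `retrieve_similar`, and the toy library below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_similar(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Rank library molecules by similarity to the query and keep the top_k."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    # Highest similarity first; ties broken by name for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical purchasable library: molecule name -> set of "on" fingerprint bits.
library = {
    "mol_A": frozenset({1, 2, 3, 4}),
    "mol_B": frozenset({1, 2, 7}),
    "mol_C": frozenset({8, 9}),
}

# Hypothetical fingerprint of a molecule produced by a generative SBDD model.
generated = frozenset({1, 2, 3, 5})

hits = retrieve_similar(generated, library, top_k=2)
print(hits)
```

In practice the bit sets would come from a cheminformatics toolkit and the library would be a catalogue of purchasable compounds; the ranking logic stays the same.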