DrugGym: A testbed for the economics of autonomous drug discovery.

Authors

Retchin Michael, Wang Yuanqing, Takaba Kenichiro, Chodera John D

Affiliations

Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, NY 10065.

Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065.

Publication information

bioRxiv. 2024 Jun 2:2024.05.28.596296. doi: 10.1101/2024.05.28.596296.

DOI: 10.1101/2024.05.28.596296

PMID: 38854082

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11160604/

Abstract

Drug discovery is stochastic. The effectiveness of candidate compounds in satisfying design objectives is unknown ahead of time, and the tools used for prioritization (predictive models and assays) are inaccurate and noisy. In a typical discovery campaign, thousands of compounds may be synthesized and tested before design objectives are achieved, with many others ideated but deprioritized. These challenges are well-documented, but assessing potential remedies has been difficult. We introduce DrugGym, a framework for modeling the stochastic process of drug discovery. Emulating biochemical assays with realistic surrogate models, we simulate the progression from weak hits to sub-micromolar leads with viable ADME. We use this testbed to examine how different ideation, scoring, and decision-making strategies impact statistical measures of utility, such as the probability of program success within predefined budgets and the expected costs to achieve target candidate profile (TCP) goals. We also assess the influence of affinity model inaccuracy, chemical creativity, batch size, and multi-step reasoning. Our findings suggest that reducing affinity model inaccuracy from 2 to 0.5 pIC50 units improves budget-constrained success rates tenfold. DrugGym represents a realistic testbed for machine learning methods applied to the hit-to-lead phase. Source code is available at www.drug-gym.org.
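To make the idea of a budget-constrained success rate concrete, the following is a minimal toy sketch, not DrugGym's code or API. It assumes a single potency objective, Gaussian-distributed analogs, an exact assay, and a predictive model whose error is Gaussian with a chosen standard deviation in pIC50 units. Every name and parameter value here (`simulate_campaign`, `success_rate`, the budget, batch size, and analog distribution) is hypothetical and chosen only for illustration.

```python
import random


def simulate_campaign(model_sigma, rng, budget=15, batch_size=3, pool_size=100,
                      start_pic50=4.0, target_pic50=7.0):
    """Run one toy campaign.

    Returns the number of compounds assayed when the target is reached,
    or None if the budget runs out first. All defaults are illustrative
    assumptions, not values taken from the paper.
    """
    best = start_pic50
    tested = 0
    while tested + batch_size <= budget:
        # Ideation: propose analogs of the current best compound. Their true
        # pIC50 values are hidden from the selection step; most are worse.
        true_values = [best + rng.gauss(-0.5, 0.7) for _ in range(pool_size)]
        # Scoring: the model sees each true value plus Gaussian error.
        scored = [(t + rng.gauss(0.0, model_sigma), t) for t in true_values]
        # Decision: assay only the top-scoring batch (assay assumed exact).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        assayed = [t for _, t in scored[:batch_size]]
        tested += batch_size
        best = max(best, max(assayed))
        if best >= target_pic50:
            return tested
    return None


def success_rate(model_sigma, n_trials=300, seed=0, **campaign_kwargs):
    """Fraction of simulated campaigns that hit the target within budget."""
    rng = random.Random(seed)
    outcomes = [simulate_campaign(model_sigma, rng, **campaign_kwargs)
                for _ in range(n_trials)]
    return sum(outcome is not None for outcome in outcomes) / n_trials


if __name__ == "__main__":
    for sigma in (2.0, 1.0, 0.5):
        print(f"model error {sigma} pIC50 units: "
              f"success rate {success_rate(sigma):.2f}")
```

In this toy setting a less accurate model wastes more of each batch on poor analogs, so fewer campaigns reach the target before the budget is spent. The size of the effect depends entirely on the assumed parameters and should not be read as a reproduction of the tenfold improvement reported in the abstract.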
