Gentile Francesco, Agrawal Vibudh, Hsing Michael, Ton Anh-Tien, Ban Fuqiang, Norinder Ulf, Gleave Martin E, Cherkasov Artem
Vancouver Prostate Centre, University of British Columbia, Vancouver, British Columbia V6H3Z6, Canada.
Swetox, Unit of Toxicology Sciences, Karolinska Institutet, Forskargatan 20, SE-151 36 Södertälje, Sweden.
ACS Cent Sci. 2020 Jun 24;6(6):939-949. doi: 10.1021/acscentsci.0c00229. Epub 2020 May 19.
Drug discovery is a rigorous process that requires billions of dollars of investment and decades of research to bring a molecule "from bench to bedside". While virtual docking can significantly accelerate drug discovery, it lags behind the current rate of expansion of chemical databases, which already exceed billions of molecular records. This recent surge in small-molecule availability presents great drug discovery opportunities, but also demands much faster screening protocols. To address this challenge, we herein introduce Deep Docking (DD), a novel deep learning platform suitable for docking billions of molecular structures in a rapid yet accurate fashion. The approach uses quantitative structure-activity relationship (QSAR) deep models, trained on docking scores of subsets of a chemical library, to approximate the docking outcome for as yet unprocessed entries and thereby remove unfavorable molecules in an iterative manner. Using the DD methodology in conjunction with the FRED docking program allowed rapid and accurate calculation of docking scores for 1.36 billion molecules from the ZINC15 library against 12 prominent target proteins, and demonstrated up to 100-fold data reduction and 6000-fold enrichment of high-scoring molecules (without notable loss of favorably docked entities). The DD protocol can readily be used in conjunction with any docking program and has been made publicly available.
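The iterative idea described above (dock a small sample, train a surrogate model on the resulting scores, discard molecules predicted to score poorly, repeat) can be illustrated with a minimal sketch. This is not the published DD implementation: the deep QSAR model is replaced here by an ordinary least-squares regressor, the docking program by a hypothetical `mock_dock` function, and the names `iterative_screen`, `sample_size`, and `keep_fraction` are illustrative assumptions. The sketch also assumes that lower docking scores are more favorable.

```python
import numpy as np

def mock_dock(features, rng):
    """Hypothetical stand-in for a docking program (assumption: lower = better)."""
    weights = np.linspace(-1.0, 1.0, features.shape[1])
    return features @ weights + 0.1 * rng.standard_normal(len(features))

def fit_linear(X, y):
    """Least-squares surrogate; a placeholder for the deep QSAR model."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict docking scores with the fitted surrogate."""
    return np.hstack([X, np.ones((len(X), 1))]) @ coef

def iterative_screen(library, n_iterations=3, sample_size=500,
                     keep_fraction=0.3, seed=0):
    """Return indices of molecules that survive all pruning rounds."""
    rng = np.random.default_rng(seed)
    remaining = np.arange(len(library))
    train_X = np.empty((0, library.shape[1]))
    train_y = np.empty(0)
    for _ in range(n_iterations):
        # 1. Dock only a small random sample of the remaining molecules.
        n = min(sample_size, len(remaining))
        sampled = rng.choice(remaining, size=n, replace=False)
        train_X = np.vstack([train_X, library[sampled]])
        train_y = np.concatenate([train_y, mock_dock(library[sampled], rng)])
        # 2. Train the surrogate on every score collected so far.
        coef = fit_linear(train_X, train_y)
        # 3. Predict scores for all remaining molecules and keep only the
        #    fraction predicted to be most favorable.
        predicted = predict_linear(coef, library[remaining])
        cutoff = np.quantile(predicted, keep_fraction)
        remaining = remaining[predicted <= cutoff]
    return remaining
```

With a random descriptor matrix such as `library = np.random.default_rng(1).standard_normal((20000, 8))`, calling `iterative_screen(library)` docks at most 1,500 molecules in total while shrinking the candidate set to a few percent of the library. In a real setting, the descriptors would be molecular fingerprints and only the surviving molecules would be passed to the actual docking program.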