In-Silico Discovery, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, Beerse B-2340, Belgium.
Discovery Technology and Molecular Pharmacology, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, Beerse B-2340, Belgium.
Chem Res Toxicol. 2023 Jul 17;36(7):1028-1036. doi: 10.1021/acs.chemrestox.2c00404. Epub 2023 Jun 16.
The search for chemical hit material is a lengthy and increasingly expensive step in drug discovery. To improve it, ligand-based quantitative structure-activity relationship models have been broadly applied to optimize primary and secondary compound properties. Although these models can be deployed as early as the molecule design stage, they have a limited applicability domain: if the structures of interest differ substantially from the chemical space on which the model was trained, a reliable prediction is not possible. Image-informed ligand-based models partly address this shortcoming by focusing on the cellular phenotype induced by small molecules rather than on their structure. While this enables chemical diversity expansion, it limits the application to compounds that are physically available and have been imaged. Here, we employ an active learning approach to capitalize on the strengths of both methods and boost the performance of a model for a mitochondrial toxicity assay (Glu/Gal). Specifically, we used a phenotypic Cell Painting screen to build a chemistry-independent model and adopted its results as the main factor in selecting compounds for experimental testing. With the additional Glu/Gal annotation for the selected compounds, we were able to dramatically improve the chemistry-informed ligand-based model, which recognized compounds from a 10% broader chemical space.
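The abstract does not spell out the selection rule, so the following is only a minimal sketch of one way such an image-guided active learning step could look. It assumes (these are illustrative assumptions, not the authors' method) that fingerprints are given as sets of on-bits, that the applicability domain is approximated by the maximum Tanimoto similarity to the training set with a hypothetical cutoff of 0.4, and that the Cell Painting model supplies a toxicity score per compound. All function names, compound IDs, and values are hypothetical.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity_to_training(
    fp: FrozenSet[int], training_fps: List[FrozenSet[int]]
) -> float:
    """Highest similarity of one compound to any training-set compound."""
    return max((tanimoto(fp, t) for t in training_fps), default=0.0)


def select_for_testing(
    candidates: Dict[str, FrozenSet[int]],
    training_fps: List[FrozenSet[int]],
    image_scores: Dict[str, float],
    budget: int,
    similarity_cutoff: float = 0.4,  # assumed cutoff, not taken from the paper
) -> List[str]:
    """Pick up to `budget` compounds for the wet-lab assay.

    Keeps only compounds outside the structure model's assumed applicability
    domain, then ranks them by the image-based model's score (highest first).
    """
    outside_domain = [
        cid
        for cid, fp in candidates.items()
        if max_similarity_to_training(fp, training_fps) < similarity_cutoff
    ]
    ranked = sorted(outside_domain, key=lambda cid: image_scores[cid], reverse=True)
    return ranked[:budget]


# Toy data: hypothetical fingerprints and image-model scores.
training_fps = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 5})]
candidates = {
    "cpd_A": frozenset({1, 2, 3, 4, 6}),   # close to the training set
    "cpd_B": frozenset({10, 11, 12}),      # structurally novel
    "cpd_C": frozenset({20, 21}),          # structurally novel
    "cpd_D": frozenset({3, 30, 31, 32}),   # mostly novel
}
image_scores = {"cpd_A": 0.9, "cpd_B": 0.8, "cpd_C": 0.2, "cpd_D": 0.6}

selected = select_for_testing(candidates, training_fps, image_scores, budget=2)
print(selected)
```

In a full loop, the selected compounds would be measured in the Glu/Gal assay, their labels added to the training set, and the structure-based model retrained, which is the step the abstract credits with widening the recognized chemical space.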