Genedata AG, Basel, Switzerland.
Genedata Inc., Lexington, MA, USA.
SLAS Discov. 2020 Aug;25(7):812-821. doi: 10.1177/2472555220918837. Epub 2020 May 20.
Drug discovery programs are moving increasingly toward phenotypic imaging assays to model disease-relevant pathways and phenotypes in vitro. These assays offer richer information than target-optimized assays by investigating multiple cellular pathways simultaneously and producing multiplexed readouts. However, extracting the desired information from complex image data poses significant challenges, preventing broad adoption of more sophisticated phenotypic assays. Deep learning-based image analysis can address these challenges by reducing the effort required to analyze large volumes of complex image data at a quality and speed adequate for routine phenotypic screening in pharmaceutical research. However, while general-purpose deep learning frameworks are readily available, they are not readily applicable to images from automated microscopy. During the past 3 years, we have optimized deep learning networks for this type of data and validated the approach across diverse assays with several industry partners. From this work, we have extracted five essential design principles that we believe should guide deep learning-based analysis of high-content images and multiparameter data: (1) insightful data representation, (2) automation of training, (3) multilevel quality control, (4) knowledge embedding and transfer to new assays, and (5) enterprise integration. We report Genedata Imagence, new deep learning-based software that embodies these principles and allows screening scientists to reliably detect stable endpoints for primary drug response, assess toxicity and safety-relevant effects, and discover new phenotypes and compound classes. Furthermore, we show how the software retains expert knowledge from its training on a particular assay and successfully reapplies it to different, novel assays in an automated fashion.