Rabal Obdulia, Link Wolfgang, Serelde Beatriz G, Bischoff James R, Oyarzabal Julen
Experimental Therapeutics Programme, Spanish National Cancer Research Centre (CNIO), Melchor Fernandez Almagro, 3, 28029 Madrid, Spain.
Mol Biosyst. 2010 Apr;6(4):711-20. doi: 10.1039/b919830j. Epub 2010 Jan 21.
Here we report the development and validation of a complete solution for managing and analyzing the data produced by image-based phenotypic screening campaigns of small-molecule libraries. In a single step, the initial raw images are analyzed for multiple cytological features, statistical analysis is performed, and molecules that produce the desired phenotypic profile are identified. A naïve Bayes classifier that integrates chemical and phenotypic spaces is built and used during the process to assess images initially classified as "fuzzy", which provides automated iterative feedback tuning. Simultaneously, all of this information is annotated directly in a relational database containing the chemical data. This novel, fully automated method was validated by re-analyzing the results of a high-content screening campaign in which 33 992 molecules were screened to identify inhibitors of the PI3K/Akt signaling pathway. The integrated one-step system identified 92% of the confirmed hits found by the conventional multistep analysis method, together with 40 new hits (14.9% of the total) that were originally false negatives. It also correctly recognized 96% of true negatives. Web-based access to the database, with customizable data retrieval and visualization tools, facilitates subsequent analysis of the annotated cytological features, which allows additional phenotypic profiles to be identified without further analysis of the original raw images.
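The step in which a naïve Bayes classifier reassesses "fuzzy" images can be illustrated with a minimal sketch. The abstract does not specify the descriptors, thresholds, or software used, so everything below is an assumption made for illustration: each compound is represented by a short binary vector that concatenates two hypothetical chemical fingerprint bits with two hypothetical binarized phenotypic readouts, and a Bernoulli naïve Bayes model with Laplace smoothing is written out by hand. The names `train_bernoulli_nb`, `posterior_hit`, and `resolve_fuzzy`, and the toy data, are not from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical feature layout (an assumption, not the paper's descriptors):
# [chem_bit_a, chem_bit_b, pheno_signal_high, pheno_cell_count_ok]
TRAIN_ROWS = [
    [1, 0, 1, 1], [1, 1, 1, 1], [1, 0, 1, 0],   # confidently classified hits
    [0, 1, 0, 1], [0, 0, 0, 1], [0, 1, 0, 0],   # confidently classified non-hits
]
TRAIN_LABELS = [1, 1, 1, 0, 0, 0]

Model = Dict[int, Tuple[float, List[float]]]


def train_bernoulli_nb(rows: Sequence[Sequence[int]], labels: Sequence[int]) -> Model:
    """Fit a Bernoulli naive Bayes model with Laplace (add-one) smoothing.

    For each class, store the log prior and the per-feature P(bit = 1 | class).
    """
    n_features = len(rows[0])
    model: Model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        log_prior = math.log(len(members) / len(rows))
        probs = [
            (sum(r[j] for r in members) + 1) / (len(members) + 2)
            for j in range(n_features)
        ]
        model[cls] = (log_prior, probs)
    return model


def posterior_hit(model: Model, row: Sequence[int]) -> float:
    """Return the posterior probability that a feature vector is a hit (class 1)."""
    log_scores = {}
    for cls, (log_prior, probs) in model.items():
        score = log_prior
        for bit, p in zip(row, probs):
            score += math.log(p if bit else 1.0 - p)
        log_scores[cls] = score
    # Normalize in log space to avoid underflow with many features.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    return weights[1] / sum(weights.values())


def resolve_fuzzy(model: Model, row: Sequence[int], threshold: float = 0.5) -> str:
    """Assign a 'fuzzy' sample to hit or non-hit using an assumed cutoff."""
    return "hit" if posterior_hit(model, row) >= threshold else "non-hit"


if __name__ == "__main__":
    model = train_bernoulli_nb(TRAIN_ROWS, TRAIN_LABELS)
    for fuzzy_row in ([1, 0, 1, 1], [0, 1, 0, 1]):
        print(fuzzy_row, round(posterior_hit(model, fuzzy_row), 3), resolve_fuzzy(model, fuzzy_row))
```

In a feedback loop of the kind the abstract describes, samples resolved this way could be appended to the training set and the model refit before the next batch of fuzzy images is assessed; how the published system actually implements that iteration is not stated here.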