Accurate Prediction of Biological Assays with High-Throughput Microscopy Images and Convolutional Networks.

Affiliations

LIT AI Lab & Institute for Machine Learning, Johannes Kepler University, Linz 4040, Austria.

Bayer AG, Berlin 13353, Germany.

Publication information

J Chem Inf Model. 2019 Mar 25;59(3):1163-1171. doi: 10.1021/acs.jcim.8b00670. Epub 2019 Mar 6.

Abstract

Predicting the outcome of biological assays based on high-throughput imaging data is a highly promising task in drug discovery, since it can tremendously increase hit rates and suggest novel chemical scaffolds. However, end-to-end learning with convolutional neural networks (CNNs) has not been assessed for the task of biological assay prediction, despite the success of these networks at visual recognition. We compared several CNNs trained directly on high-throughput imaging data to (a) CNNs trained on cell-centric crops and (b) the current state of the art: fully connected networks trained on precalculated morphological cell features. The comparison was performed on the Cell Painting data set, the largest publicly available data set of microscopic images of cells, with approximately 30,000 compound treatments. We found that CNNs perform significantly better at predicting the outcome of assays than fully connected networks operating on precomputed morphological features of cells. Surprisingly, the best-performing method could predict 32% of the 209 biological assays at high predictive performance (AUC > 0.9), indicating that the cell morphology changes contain a large amount of information about compound activities. Our results suggest that many biological assays could be replaced by high-throughput imaging together with convolutional neural networks, and that the costly cell segmentation and feature extraction step can be replaced by convolutional neural networks.
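
The headline figure in the abstract (the share of assays predicted with AUC > 0.9) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `roc_auc` and `fraction_above`, the matrix layout (compounds in rows, assays in columns), and the use of NaN for unmeasured compound-assay pairs are all assumptions made for illustration.

```python
import numpy as np


def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic; tied scores get average ranks."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both classes
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def fraction_above(y_true, y_score, threshold=0.9):
    """Per-assay AUCs and the fraction of assays whose AUC exceeds threshold.

    y_true:  (n_compounds, n_assays), 1 = active, 0 = inactive, NaN = unmeasured
             (hypothetical layout, assumed for this sketch).
    y_score: (n_compounds, n_assays) predicted activity scores.
    """
    aucs = []
    for k in range(y_true.shape[1]):
        measured = ~np.isnan(y_true[:, k])
        aucs.append(roc_auc(y_true[measured, k] == 1, y_score[measured, k]))
    aucs = np.array(aucs)
    valid = ~np.isnan(aucs)
    return aucs, float((aucs[valid] > threshold).mean())
```

Under these assumptions, calling `fraction_above(y_true, y_score)` on a 209-column label matrix would yield the kind of summary reported above; assays with only one class among the measured compounds are skipped because their AUC is undefined.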
