van Dijk Robert, Arevalo John, Babadi Mehrtash, Carpenter Anne E, Singh Shantanu
CellVoyant Technologies.
Broad Institute of MIT and Harvard.
bioRxiv. 2024 Jul 31:2023.11.14.567038. doi: 10.1101/2023.11.14.567038.
Image-based cell profiling is a powerful tool that compares perturbed cell populations by measuring thousands of single-cell features and summarizing them into profiles. Typically a sample is represented by averaging across cells, but this fails to capture the heterogeneity within cell populations. We introduce CytoSummaryNet: a Deep Sets-based approach that improves mechanism of action prediction by 30-68% in mean average precision compared to average profiling on a public dataset. CytoSummaryNet uses self-supervised contrastive learning in a multiple-instance learning framework, providing an easier-to-apply method for aggregating single-cell feature data than previously published strategies. Interpretability analysis suggests that the model achieves this improvement by downweighting small mitotic cells or those with debris and prioritizing large uncrowded cells. The approach requires only perturbation labels for training, which are readily available in all cell profiling datasets. CytoSummaryNet offers a straightforward post-processing step for single-cell profiles that can significantly boost retrieval performance on image-based profiling datasets.
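To make the contrast between average profiling and a learned aggregation concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the softmax-weighted pooling is a simplified stand-in for the Deep Sets-based model, and the function names, the linear scorer `w`, and the toy data are all assumptions made for illustration. In the method described above, the aggregation would be trained with contrastive learning on perturbation labels; here the parameters are random.

```python
import numpy as np


def average_profile(cells: np.ndarray) -> np.ndarray:
    """Baseline: average each feature (column) over all cells (rows)."""
    return cells.mean(axis=0)


def weighted_profile(cells: np.ndarray, w: np.ndarray, b: float = 0.0) -> np.ndarray:
    """Permutation-invariant pooling with one weight per cell.

    A hypothetical linear scorer assigns each cell a score; a softmax turns
    the scores into weights that sum to 1, so some cells can be downweighted
    and others prioritized. This is an illustrative simplification only.
    """
    scores = cells @ w + b
    scores = scores - scores.max()      # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()
    return weights @ cells              # weighted sum over cells


rng = np.random.default_rng(0)
cells = rng.normal(size=(500, 8))       # toy data: 500 cells, 8 features
w = rng.normal(size=8)                  # stand-in for learned parameters

avg = average_profile(cells)
wtd = weighted_profile(cells, w)

# Shuffling the cells must not change the pooled profile.
perm = rng.permutation(len(cells))
wtd_shuffled = weighted_profile(cells[perm], w)

# With a zero scorer every cell gets equal weight, recovering the average.
uniform = weighted_profile(cells, np.zeros(8))
```

With all-zero scorer parameters the weighted profile reduces to the plain average, which shows that average profiling is the special case in which every cell counts equally.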