Virdi Anish, Joglekar Ajit P
Department of Biophysics, University of Michigan.
Cell & Developmental Biology, University of Michigan Medical School.
bioRxiv. 2025 Jan 24:2025.01.23.634498. doi: 10.1101/2025.01.23.634498.
High-throughput fluorescence microscopy is an essential tool in systems biology studies of eukaryotic cells. Its power is fully realized only when every cell in a field of view can be accurately localized and quantified across the entire time series. These tasks map onto a common paradigm in computer vision: instance segmentation. Supervised deep learning methods have recently become the state of the art for cellular instance segmentation, but they require large amounts of high-quality training data. Because annotated training data is typically assembled through laborious hand annotation and is therefore scarce, this requirement limits our ability to train increasingly performant object detectors. Here, we present a generalizable method for generating large instance segmentation training datasets for tissue-culture cells in transmitted light microscopy images. We use datasets created by this method to train vision transformer (ViT) based Mask-RCNNs (Region-based Convolutional Neural Networks) that produce instance segmentations in which each cell is classified as "m-phase" (dividing) or "interphase" (non-dividing). Dividing cells are rarer than non-dividing cells for biological reasons, so the annotations are imbalanced between the two classes; we address this imbalance during training with probabilistically weighted loss functions and partisan training data collection methods. We demonstrate the validity of these approaches by producing highly accurate object detectors that can serve as general tools for the segmentation and classification of morphologically diverse cells. Since the methodology depends only on generic cellular features, we hypothesize that it can be further generalized to most adherent tissue culture cell lines.
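The abstract does not say how the loss functions are weighted, so the following is only an illustrative sketch of one common way to counter class imbalance: inverse-frequency class weights applied to a cross-entropy loss. It is not the authors' implementation. The function names (`inverse_frequency_weights`, `weighted_cross_entropy`) and the 9:1 class ratio are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to class frequency.

    Assumption: weight = N / (K * count), where N is the number of
    annotations and K the number of classes. The source does not state
    the scheme actually used.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-sample negative log-likelihoods.

    probs  : list of dicts mapping class name -> predicted probability
    labels : list of true class names
    weights: dict mapping class name -> class weight
    """
    total = 0.0
    norm = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(p[y])
        norm += w
    return total / norm


# Hypothetical imbalanced annotation set: 9 interphase cells per m-phase cell.
labels = ["interphase"] * 9 + ["m-phase"]
weights = inverse_frequency_weights(labels)

# A classifier that is uninformative (0.5 / 0.5) on every cell.
probs = [{"interphase": 0.5, "m-phase": 0.5} for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights, a misclassified m-phase cell contributes nine times as much to the loss as a misclassified interphase cell, so the rare class is not drowned out by the common one during training.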