Kochetov Bogdan, Bell Phoenix, Garcia Paulo S, Shalaby Akram S, Raphael Rebecca, Raymond Benjamin, Leibowitz Brian J, Schoedel Karen, Brand Rhonda M, Brand Randall E, Yu Jian, Zhang Lin, Diergaarde Brenda, Schoen Robert E, Singhi Aatur, Uttam Shikhar
Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA.
UPMC Hillman Cancer Center, Pittsburgh, PA, USA.
bioRxiv. 2024 Apr 23:2023.11.13.566842. doi: 10.1101/2023.11.13.566842.
Multiplexed imaging technologies have made it possible to interrogate complex tumor microenvironments at sub-cellular resolution within their native spatial context. However, proper quantification of this complexity requires the ability to easily and accurately segment cells into their sub-cellular compartments. Within the supervised learning paradigm, deep learning-based segmentation methods demonstrating human-level performance have emerged. However, limited work has been done in developing such generalist methods within the label-free unsupervised context. Here we present an unsupervised segmentation (UNSEG) method that achieves deep learning-level performance without requiring any training data. UNSEG leverages a Bayesian-like framework and the specificity of nucleus and cell membrane markers to construct an a posteriori probability estimate of each pixel belonging to the nucleus, cell membrane, or background. It uses this estimate to segment each cell into its nuclear and cell-membrane compartments. We show that UNSEG is more internally consistent and better at generalizing to the complexity of tissue morphology than current deep learning methods. This allows UNSEG to unambiguously identify the cytoplasmic compartment of a cell, which we employ to demonstrate its use in an exemplar biological scenario. Within the UNSEG framework, we also introduce a new perturbed watershed algorithm that stably and automatically segments a cluster of cell nuclei into individual nuclei and improves on the accuracy of classical watershed. Perturbed watershed can also be used as a standalone algorithm that researchers can incorporate within their supervised or unsupervised learning approaches to extend classical watershed, particularly in the multiplexed imaging context. Finally, as part of developing UNSEG, we have generated a high-quality annotated gastrointestinal tissue (GIT) dataset, which we anticipate will be useful for the broader research community.
We demonstrate the efficacy of UNSEG on the GIT dataset, on publicly available datasets, and in a range of practical scenarios. In these contexts, we also discuss the possibility of bias inherent in quantifying segmentation accuracy based on the F1 score. Segmentation, despite its long history, remains a challenging problem, particularly in the context of tissue samples. UNSEG, an easy-to-use algorithm, provides an unsupervised approach to overcome this bottleneck and, as we discuss, can help improve deep learning-based segmentation methods by providing a bridge between the unsupervised and supervised learning paradigms.
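To make the idea of a per-pixel probability estimate over nucleus, cell membrane, and background more concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the UNSEG formulation, which the abstract does not specify. The function name `pixel_posteriors`, the toy likelihood terms, and the uniform priors are all assumptions made for illustration, and the two marker images are assumed to be 2D and already normalized to [0, 1].

```python
import numpy as np

def pixel_posteriors(nucleus, membrane, priors=(1 / 3, 1 / 3, 1 / 3)):
    """Toy Bayes-style posterior over (nucleus, membrane, background).

    Hypothetical illustration only: the likelihood terms below are an
    assumption, not the model used by UNSEG.
    """
    n = np.clip(np.asarray(nucleus, dtype=float), 0.0, 1.0)
    m = np.clip(np.asarray(membrane, dtype=float), 0.0, 1.0)
    eps = 1e-12  # keeps the normalizer strictly positive

    # Assumed likelihoods: a class is favored when its own marker is
    # bright and the other marker is dim; background when both are dim.
    likelihood = np.stack(
        [n * (1 - m), m * (1 - n), (1 - n) * (1 - m)], axis=0
    ) + eps

    # Bayes' rule per pixel: posterior is proportional to likelihood * prior.
    prior = np.asarray(priors, dtype=float)[:, None, None]
    weighted = likelihood * prior
    return weighted / weighted.sum(axis=0, keepdims=True)

# Tiny synthetic 2x2 example (values are made up).
nucleus_img = np.array([[0.9, 0.1], [0.1, 0.05]])
membrane_img = np.array([[0.1, 0.8], [0.1, 0.9]])

posterior = pixel_posteriors(nucleus_img, membrane_img)
# 0 = nucleus, 1 = membrane, 2 = background
labels = posterior.argmax(axis=0)
```

In this sketch the class label of each pixel is simply the class with the highest posterior. A real pipeline would still need further steps, such as separating touching nuclei, which is the role the abstract assigns to the perturbed watershed algorithm.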