School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada.
Bioinformatics Graduate Program, University of British Columbia, Vancouver, Canada.
Nat Commun. 2024 May 10;15(1):3942. doi: 10.1038/s41467-024-48062-1.
In clinical oncology, many diagnostic tasks rely on the identification of cells in histopathology images. Supervised machine learning techniques require labels, but providing manual cell annotations is time-consuming. In this paper, we propose a self-supervised framework (enVironment-aware cOntrastive cell represenTation learning: VOLTA) for cell representation learning in histopathology images using a technique that accounts for the cell's mutual relationship with its environment. We evaluate our model in extensive experiments on data collected from multiple institutions, comprising over 800,000 cells and six cancer types. To showcase the potential of the framework, we apply VOLTA to ovarian and endometrial cancers and demonstrate that our cell representations can be used to identify the known histotypes of ovarian cancer and to provide insights that link histopathology and molecular subtypes of endometrial cancer. Unlike supervised models, our framework can enable discoveries without any annotations, even when sample sizes are limited.