Balan Jagadheshwar, McDonnell Shannon K, Fogarty Zachary, Larson Nicholas B
Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA.
J Pathol Inform. 2025 Apr 15;17:100443. doi: 10.1016/j.jpi.2025.100443. eCollection 2025 Apr.
Characterizing cellular composition in tissue samples offers fundamental insights into functional and biological processes. Understanding the abundance or absence of specific cell types, such as inflammatory cells in microenvironments such as tumors, can inform the understanding of disease progression and help guide personalized medicine. Several clinical laboratory methods for characterizing cellular composition are limited by poor scalability and high cost. Digitizing pathology slides and applying deep learning (DL) models has enabled efficient and cost-effective nuclei segmentation and cell type quantification; however, DL models are limited by their inability to segment specific cell types, and some models may be more effective than others at certain tasks. Consequently, there remains a need for methods that leverage the strengths of multiple models to efficiently integrate nuclei segmentation for various cell types. In this study, we propose a novel solution for integrating nuclei segmentation from multiple DL methods on hematoxylin and eosin slides from 471 normal prostate samples and highlight the limitations of using a single DL method. We validate the DL-derived cell type proportions by comparing them against estimates from a manual pathologist review and show that the integrated approach yields higher concordance than the individual models. We further validate the DL-derived cell type proportions by their ability to explain the variance of RNA gene expression. The integrated approach yields robust cell type proportions that explain the variance of gene expression with 12% and 22% relative improvement over the current state-of-the-art model and manual pathologist review, respectively. The subset of 403 genes for which epithelial proportion explained a high share of variation (>30%) was significantly enriched for relevant biological pathways.
These findings indicate that ensemble approaches to nuclei segmentation and cell-type classification may provide more accurate representations of cellular composition from digitized slides.
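The "explained variance" validation can be illustrated with a minimal sketch. The abstract does not say which statistical model was used, so this example assumes an ordinary least-squares fit of one gene's expression on a cell type proportion and reports R². The function names, the synthetic data, and the noise levels are all hypothetical and are not taken from the study.

```python
import numpy as np

def explained_variance(proportions, expression):
    """R^2 of an ordinary least-squares fit of expression on cell type proportions.

    Assumption: a plain linear model with an intercept; the study's actual
    model is not described in the abstract.
    """
    # Design matrix: intercept column plus one or more proportion columns.
    X = np.column_stack([np.ones(len(expression)), proportions])
    coef, *_ = np.linalg.lstsq(X, expression, rcond=None)
    residuals = expression - X @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((expression - expression.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, e.g. 0.12 for a 12% improvement."""
    return (new - baseline) / baseline

# Synthetic illustration only: a hypothetical "true" epithelial proportion,
# a noisier estimate of it, and one gene whose expression tracks the truth.
rng = np.random.default_rng(0)
n_samples = 200
epithelial = rng.uniform(0.2, 0.8, n_samples)
noisy_estimate = np.clip(epithelial + rng.normal(0, 0.15, n_samples), 0, 1)
expression = 3.0 * epithelial + rng.normal(0, 0.3, n_samples)

r2_accurate = explained_variance(epithelial, expression)
r2_noisy = explained_variance(noisy_estimate, expression)
print(f"R^2 with accurate proportions: {r2_accurate:.2f}")
print(f"R^2 with noisy proportions:    {r2_noisy:.2f}")
print(f"Relative improvement:          {relative_improvement(r2_accurate, r2_noisy):.0%}")
```

In this toy setting, the more accurate proportion estimate explains more of the expression variance, which is the logic behind using explained variance to compare estimates from different segmentation approaches.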