A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology.

Publication information

IEEE Trans Med Imaging. 2017 Jul;36(7):1550-1560. doi: 10.1109/TMI.2017.2677499. Epub 2017 Mar 6.

Abstract

Nuclear segmentation in digital microscopic tissue images can enable extraction of high-quality features for nuclear morphometrics and other analysis in computational pathology. Conventional image processing techniques, such as Otsu thresholding and watershed segmentation, do not work effectively on challenging cases, such as chromatin-sparse and crowded nuclei. In contrast, machine learning-based segmentation can generalize across various nuclear appearances. However, training machine learning algorithms requires data sets of images, in which a vast number of nuclei have been annotated. Publicly accessible and annotated data sets, along with widely agreed upon metrics to compare techniques, have catalyzed tremendous innovation and progress on other image classification problems, particularly in object recognition. Inspired by their success, we introduce a large publicly accessible data set of hematoxylin and eosin (H&E)-stained tissue images with more than 21000 painstakingly annotated nuclear boundaries, whose quality was validated by a medical doctor. Because our data set is taken from multiple hospitals and includes a diversity of nuclear appearances from several patients, disease states, and organs, techniques trained on it are likely to generalize well and work right out-of-the-box on other H&E-stained images. We also propose a new metric to evaluate nuclear segmentation results that penalizes object- and pixel-level errors in a unified manner, unlike previous metrics that penalize only one type of error. We also propose a segmentation technique based on deep learning that lays a special emphasis on identifying the nuclear boundaries, including those between the touching or overlapping nuclei, and works well on a diverse set of test images.
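The abstract does not give the formula for the proposed metric. As a rough illustration of how object-level and pixel-level errors can be penalized together, the sketch below computes an aggregated Jaccard-style score over instance label maps: each ground-truth nucleus is paired with the predicted object that overlaps it best, intersections and unions are summed across pairs, and predicted objects that were never paired add their full area to the denominator. The function name `aggregated_jaccard`, the label-map convention (0 for background, positive integers for nuclei), and the matching rule are assumptions made for this example and may differ from the exact definition in the paper.

```python
import numpy as np


def aggregated_jaccard(gt: np.ndarray, pred: np.ndarray) -> float:
    """Aggregated Jaccard-style score for two instance label maps.

    gt, pred: integer arrays of the same shape; 0 is background and each
    positive value identifies one nucleus (assumed convention).
    """
    gt_ids = [i for i in np.unique(gt) if i != 0]
    pred_ids = [j for j in np.unique(pred) if j != 0]

    used = set()      # predicted objects paired with some ground-truth nucleus
    inter_sum = 0     # summed pixel intersections over all pairs
    union_sum = 0     # summed pixel unions, plus all unmatched areas

    for i in gt_ids:
        g = gt == i
        # A missed nucleus contributes its whole area to the denominator.
        best_iou, best_j = 0.0, None
        best_inter, best_union = 0, int(g.sum())
        # Only predicted objects that touch g can have a non-zero overlap.
        for j in np.unique(pred[g]):
            if j == 0:
                continue
            p = pred == j
            inter = int(np.logical_and(g, p).sum())
            union = int(np.logical_or(g, p).sum())
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
                best_inter, best_union = inter, union
        inter_sum += best_inter
        union_sum += best_union
        if best_j is not None:
            used.add(best_j)

    # Predicted objects never chosen as a best match are false positives.
    for j in pred_ids:
        if j not in used:
            union_sum += int((pred == j).sum())

    # Two empty label maps are treated as a perfect match.
    return inter_sum / union_sum if union_sum else 1.0
```

With this definition, a prediction that merges two separate 4-pixel nuclei into a single 10-pixel blob scores 0.4, even though the foreground pixels alone overlap almost perfectly. A purely pixel-level score would hide that kind of object-level error.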
