Zhang Hongtai, Liu Zaiyi, Song Mingli, Lu Cheng
School of Computer and Cyber Sciences, Communication University of China, Beijing 100024, China.
Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou 510080, China.
J Pathol Inform. 2023 Feb 16;14:100302. doi: 10.1016/j.jpi.2023.100302. eCollection 2023.
Training a robust cancer diagnostic or prognostic artificial intelligence model on histology images requires a large number of representative cases with labels or annotations, which are difficult to obtain. The histology snapshots available in published papers or case reports can be used to enrich the training dataset. However, the magnifications of these invaluable snapshots are generally unknown, which limits their use. Therefore, a robust magnification predictor is required to exploit such diverse snapshot repositories spanning different diseases. This paper presents a magnification prediction model named Hagnifinder for H&E-stained histological images.
Hagnifinder is a regression model based on a modified convolutional neural network (CNN) that contains 3 modules: a Feature Extraction Module, a Regression Module, and an Adaptive Scaling Module (ASM). In the training phase, the Feature Extraction Module first extracts the image features. The ASM then addresses the problem of unevenly distributed learned feature values. Finally, the Regression Module estimates the mapping between the regularized extracted features and the magnifications. To train a robust model, we construct a new dataset named Hagni40, consisting of 94,643 H&E-stained histology image patches at 40 different magnifications across 13 cancer types, derived from The Cancer Genome Atlas. To verify the performance of Hagnifinder, we measure prediction accuracy under maximum allowable differences (0.5, 1, and 5) between the predicted and the actual magnification. We compare Hagnifinder with state-of-the-art methods on the public BreakHis dataset and on Hagni40.
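The tolerance-based accuracy used for evaluation can be sketched as follows; this is a minimal illustration of the metric as described (a prediction counts as correct when its absolute error does not exceed the allowable difference), not the authors' code, and the function name and example values are hypothetical:

```python
def accuracy_within_tolerance(predicted, actual, tol):
    """Fraction of predictions whose absolute magnification error
    does not exceed the maximum allowable difference `tol`."""
    assert len(predicted) == len(actual) and len(actual) > 0
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tol)
    return hits / len(predicted)

# Hypothetical predicted vs. ground-truth magnifications
pred = [9.6, 20.4, 38.7, 11.2]
true = [10.0, 20.0, 40.0, 10.0]
for tol in (0.5, 1, 5):
    print(f"tol={tol}: accuracy={accuracy_within_tolerance(pred, true, tol)}")
```

A looser tolerance admits more predictions as correct, which is why the paper reports accuracy at several allowable-difference values (0.5, 1, and 5).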
Hagnifinder provides consistent prediction accuracy, with a mean accuracy of 98.9%, across 40 different magnifications and 13 different cancer types when ResNet50 is used as the feature extractor. Compared with state-of-the-art methods that focus on classifying only 4-5 magnification levels, Hagnifinder achieves the best or comparable performance on both the BreakHis and Hagni40 datasets.
The experimental results suggest that Hagnifinder can be a valuable tool for predicting the associated magnification of any given histology image.