Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.
Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany.
Histopathology. 2024 Jun;84(7):1139-1153. doi: 10.1111/his.15159. Epub 2024 Feb 26.
BACKGROUND: Artificial intelligence (AI) has numerous applications in pathology, supporting diagnosis and prognostication in cancer. However, most AI models are trained on highly selected data, typically one tissue slide per patient. In reality, especially for large surgical resection specimens, dozens of slides can be available for each patient. Manually sorting and labelling whole-slide images (WSIs) is a very time-consuming process, hindering the direct application of AI to tissue samples collected from large cohorts. In this study, we addressed this issue by developing a deep-learning (DL)-based method for automatic curation of large pathology datasets with several slides per patient.
METHODS: We collected multiple large multicentric datasets of colorectal cancer histopathological slides from the United Kingdom (FOXTROT, N = 21,384 slides; CR07, N = 7985 slides) and Germany (DACHS, N = 3606 slides). These datasets contained multiple types of tissue slides, including bowel resection specimens, endoscopic biopsies, lymph node resections, immunohistochemistry-stained slides, and tissue microarrays. We developed, trained, and tested a deep convolutional neural network model to predict the type of slide from the slide overview (thumbnail) image. The primary statistical endpoint was the macro-averaged area under the receiver operating characteristic curve (AUROC) for detection of the type of slide.
RESULTS: In the primary dataset (FOXTROT), the algorithm achieved high classification performance, with an AUROC of 0.995 (95% confidence interval [CI]: 0.994-0.996), and was able to accurately predict the type of slide from the thumbnail image alone. In the two external test cohorts (CR07, DACHS), AUROCs of 0.982 (95% CI: 0.979-0.985) and 0.875 (95% CI: 0.864-0.887) were observed, indicating that the trained model generalizes to unseen datasets. With a confidence threshold of 0.95, the model reached an accuracy of 94.6% (7331 classified cases) in CR07 and 85.1% (2752 classified cases) in the DACHS cohort.
CONCLUSION: Our findings show that the low-resolution thumbnail image is sufficient to accurately classify the type of slide in digital pathology. This can help researchers make the vast resource of existing pathology archives accessible to modern AI models with only minimal manual annotation.
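The confidence-threshold step described in the results can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide-type labels and the use of the top softmax probability as the confidence score are assumptions for the sketch; predictions below the threshold are abstained and left for manual review.

```python
import numpy as np

# Illustrative slide-type labels (assumed; the study's exact label set is not given here)
SLIDE_TYPES = ["resection", "biopsy", "lymph_node", "IHC", "TMA"]

def softmax(logits):
    # Numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def classify_with_threshold(logits, threshold=0.95):
    """Return the predicted slide type per thumbnail, or None when the
    top-class probability falls below the threshold (abstain)."""
    probs = softmax(np.asarray(logits, dtype=float))
    preds = []
    for p in probs:
        top = int(p.argmax())
        preds.append(SLIDE_TYPES[top] if p[top] >= threshold else None)
    return preds

# Example: the first thumbnail is classified confidently;
# the second is ambiguous and is abstained for manual review.
logits = [[9.0, 1.0, 0.5, 0.2, 0.1],
          [2.0, 1.8, 1.5, 1.2, 1.0]]
print(classify_with_threshold(logits))
```

Raising the threshold trades coverage (fraction of slides auto-classified) for accuracy on the classified subset, which is the trade-off reflected in the CR07 and DACHS accuracy figures.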