Machine learning group, Technische Universität Berlin, 10623, Berlin, Germany.
BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany.
Sci Rep. 2024 Oct 23;14(1):24988. doi: 10.1038/s41598-024-75256-w.
In this paper we present a deep learning segmentation approach to classify and quantify the two most prevalent primary liver cancers - hepatocellular carcinoma and intrahepatic cholangiocarcinoma - from hematoxylin and eosin (H&E) stained whole-slide images. While semantic segmentation of medical images typically requires costly pixel-level annotations by domain experts, additional information is often routinely obtained in clinical diagnostics but rarely utilized for model training. We propose to leverage such weak information from patient diagnoses by deriving complementary labels that indicate to which class a sample cannot belong. To integrate these labels, we formulate a complementary loss for segmentation. Motivated by the medical application, we demonstrate for general segmentation tasks that including additional patches with solely weak complementary labels during model training can significantly improve the predictive performance and robustness of a model. On the task of diagnostic differentiation between hepatocellular carcinoma and intrahepatic cholangiocarcinoma, we achieve a balanced accuracy of 0.91 (95% CI: 0.86-0.95) at case level for 165 hold-out patients. Furthermore, we show that leveraging complementary labels improves the robustness of segmentation and increases performance at case level.
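The abstract does not give the exact form of the complementary loss. A minimal sketch of one plausible formulation, assuming per-pixel softmax probabilities and a single excluded class per patch: the loss penalizes probability mass assigned to the class that the complementary label rules out (function name, array shapes, and the `-log(1 - p)` form are illustrative assumptions, not the authors' published definition).

```python
import numpy as np

def complementary_loss(probs, forbidden_class):
    """Penalize probability mass on the class a patch cannot belong to.

    probs: (H, W, C) array of per-pixel class probabilities (softmax output).
    forbidden_class: index of the class excluded by the complementary label.
    Returns the mean of -log(1 - p_forbidden) over all pixels, which is
    zero only when no pixel assigns probability to the forbidden class.
    """
    p = probs[..., forbidden_class]
    eps = 1e-12  # numerical stability near p = 1
    return float(np.mean(-np.log(1.0 - p + eps)))

# Example: uniform predictions over 3 classes give p_forbidden = 1/3,
# so the loss equals -log(2/3) ~= 0.405 for every pixel.
uniform = np.full((4, 4, 3), 1.0 / 3.0)
loss = complementary_loss(uniform, forbidden_class=0)
```

Such a term can be added to the standard supervised segmentation loss for the weakly labeled patches, so they contribute gradient signal without pixel-level annotations.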