Gerussi Alessio, Saldanha Oliver Lester, Cazzaniga Giorgio, Verda Damiano, Carrero Zunamys I, Engel Bastian, Taubert Richard, Bolis Francesca, Cristoferi Laura, Malinverno Federica, Colapietro Francesca, Akpinar Reha, Di Tommaso Luca, Terracciano Luigi, Lleo Ana, Viganó Mauro, Rigamonti Cristina, Cabibi Daniela, Calvaruso Vincenza, Gibilisco Fabio, Caldonazzi Nicoló, Valentino Alessandro, Ceola Stefano, Canini Valentina, Nofit Eugenia, Muselli Marco, Calderaro Julien, Tiniakos Dina, L'Imperio Vincenzo, Pagni Fabio, Zucchini Nicola, Invernizzi Pietro, Carbone Marco, Kather Jakob Nikolas
Division of Gastroenterology, Center for Autoimmune Liver Diseases, European Reference Network on Hepatological Diseases (ERN RARE-LIVER), Fondazione IRCCS San Gerardo dei Tintori, Monza, Italy.
Department of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy.
JHEP Rep. 2024 Aug 31;7(2):101198. doi: 10.1016/j.jhepr.2024.101198. eCollection 2025 Feb.
BACKGROUND & AIMS: Biliary abnormalities in autoimmune hepatitis (AIH) and interface hepatitis in primary biliary cholangitis (PBC) occur frequently, and misinterpretation may lead to therapeutic mistakes with a negative impact on patients. This study investigates the use of a deep learning (DL)-based pipeline for the diagnosis of AIH and PBC to aid differential diagnosis.
METHODS: We conducted a multicenter study across six European referral centers, and built a library of digitized liver biopsy slides dating from 1997 to 2023. A training set of 354 cases (266 AIH and 102 PBC) and an external validation set of 92 cases (62 AIH and 30 PBC) were available for analysis. A novel DL model, the autoimmune liver neural estimator (ALNE), was trained on whole-slide images (WSIs) with H&E staining, without human annotations. The ALNE model was evaluated against clinico-pathological diagnoses and tested for interobserver variability among general pathologists.
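Training on WSIs without human annotations is commonly done by splitting each slide into tiles and letting the model learn which tiles matter for the slide-level label. The sketch below illustrates one generic form of this idea, attention-weighted pooling of tile features. It is not the ALNE architecture: the function names, the two-dimensional toy features, and the fixed scoring vector (which a real model would learn) are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tile_features, scoring_vector):
    """Pool per-tile feature vectors into one slide-level vector.

    tile_features  : list of equal-length feature vectors, one per tile
    scoring_vector : hypothetical stand-in for learned attention parameters
    Returns (slide_vector, attention_weights).
    """
    # Score each tile with a dot product against the scoring vector.
    scores = [
        sum(f * w for f, w in zip(features, scoring_vector))
        for features in tile_features
    ]
    attention = softmax(scores)
    dim = len(tile_features[0])
    # Slide-level vector = attention-weighted average of the tile vectors.
    slide_vector = [
        sum(a * features[d] for a, features in zip(attention, tile_features))
        for d in range(dim)
    ]
    return slide_vector, attention


# Toy example: two tiles with made-up features.
tiles = [[1.0, 0.0], [0.0, 1.0]]
slide_vector, attention = attention_pool(tiles, [10.0, 0.0])
```

The per-tile attention weights returned here are the kind of quantity that can be mapped back onto the slide to draw an attention heatmap.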
RESULTS: The ALNE model demonstrated high accuracy in differentiating AIH from PBC, achieving an area under the receiver operating characteristic curve of 0.81 in external validation. Attention heatmaps showed that ALNE tends to focus more on areas with increased inflammation, associating such patterns predominantly with AIH. A multivariate explainable machine learning model revealed that PBC cases misclassified as AIH more often had ALP values between 1 × upper limit of normal (ULN) and 2 × ULN, coupled with AST values above 1 × ULN. Agreement among general pathologists evaluating a random sample of the same cases was poor (Fleiss's kappa 0.09).
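The two summary statistics reported above can each be computed in a few lines. The sketch below is a minimal pure-Python illustration under stated assumptions, not the study's analysis code: `auroc` uses the rank-based (Mann-Whitney) formulation, `fleiss_kappa` assumes every case is rated by the same number of raters, and the toy labels and scores are invented for the example.

```python
from collections import Counter


def auroc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def fleiss_kappa(ratings):
    """Fleiss's kappa for categorical ratings.

    ratings: one list per case, holding one category label per rater.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every case needs the same number (>= 2) of raters")
    counts = [Counter(r) for r in ratings]
    categories = sorted({label for r in ratings for label in r})
    # Observed agreement: mean over cases of the share of agreeing rater pairs.
    per_case = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_case) / n_cases
    # Chance agreement from the overall proportion of each category.
    proportions = [
        sum(c[k] for c in counts) / (n_cases * n_raters) for k in categories
    ]
    expected = sum(p * p for p in proportions)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): 1 = AIH, 0 = PBC, with made-up model scores.
toy_auc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
toy_kappa = fleiss_kappa([["AIH", "AIH", "AIH"], ["PBC", "PBC", "PBC"]])
```

A kappa near 0, as reported for the general pathologists, means observed agreement is barely above what the category frequencies alone would produce by chance.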
CONCLUSIONS: The ALNE model is the first system generating a quantitative and accurate differential diagnosis between cases with AIH or PBC.
IMPACT AND IMPLICATIONS: This study demonstrates the significant potential of the autoimmune liver neural estimator model, a transformer-based deep learning system, to accurately distinguish between autoimmune hepatitis and primary biliary cholangitis using digitized liver biopsy slides without human annotation. The scientific justification for this work lies in addressing the challenge of differentiating these conditions, which often present with overlapping features and can lead to therapeutic mistakes. In addition, there is a need for quantitative assessment of the information embedded in liver biopsies, which are currently evaluated using qualitative or semi-quantitative methods. The results of this study are crucial for pathologists, researchers, and clinicians, providing a reliable diagnostic tool that reduces interobserver variability and improves the diagnostic accuracy of these conditions. Potential methodological limitations, such as the diversity in scanning techniques and slide staining, were considered, ensuring the robustness and generalizability of the findings.