Oestmann Paula M, Wang Clinton J, Savic Lynn J, Hamm Charlie A, Stark Sophie, Schobert Isabel, Gebauer Bernhard, Schlachter Todd, Lin MingDe, Weinreb Jeffrey C, Batra Ramesh, Mulligan David, Zhang Xuchen, Duncan James S, Chapiro Julius
Department of Radiology and Biomedical Imaging, Yale School of Medicine, 333 Cedar Street, New Haven, CT, 06520, USA.
Institute of Radiology, Berlin Institute of Health, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität, 10117, Berlin, Germany.
Eur Radiol. 2021 Jul;31(7):4981-4990. doi: 10.1007/s00330-020-07559-1. Epub 2021 Jan 6.
To train a deep learning model to differentiate between pathologically proven hepatocellular carcinoma (HCC) and non-HCC lesions, including lesions with atypical imaging features, on MRI.
This IRB-approved retrospective study included 118 patients with 150 lesions (93 (62%) HCC and 57 (38%) non-HCC) pathologically confirmed through biopsies (n = 72), resections (n = 29), liver transplants (n = 46), and autopsies (n = 3). Forty-seven percent of HCC lesions showed atypical imaging features (not meeting Liver Imaging Reporting and Data System [LI-RADS] criteria for definitive HCC/LR5). A 3D convolutional neural network (CNN) was trained on 140 lesions and tested for its ability to classify the 10 remaining lesions (5 HCC/5 non-HCC). Performance of the model was averaged over 150 runs with random sub-sampling to provide class-balanced test sets. A lesion grading system was developed to demonstrate the similarity between atypical HCC and non-HCC lesions prone to misclassification by the CNN.
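The repeated random sub-sampling described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `balanced_split`, the label encoding (1 = HCC, 0 = non-HCC), and the fixed seed are assumptions made for the example. Only the lesion counts, the 5/5 test-set composition, and the 150 runs come from the abstract.

```python
import random


def balanced_split(labels, n_per_class=5, rng=None):
    """Return (train_idx, test_idx) with n_per_class test lesions drawn per class.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = rng or random.Random()
    test_idx = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        # Sample without replacement within each class
        test_idx.extend(rng.sample(members, n_per_class))
    test_set = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in test_set]
    return train_idx, sorted(test_idx)


# 93 HCC (label 1) and 57 non-HCC (label 0), as reported in the abstract
labels = [1] * 93 + [0] * 57

rng = random.Random(0)  # fixed seed is an assumption, used only for reproducibility
splits = [balanced_split(labels, 5, rng) for _ in range(150)]

train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))  # 140 10
```

Under this scheme each test set is class-balanced, while each 140-lesion training set keeps the remaining 88 HCC and 52 non-HCC lesions and is therefore not balanced.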
The CNN demonstrated an overall accuracy of 87.3%. Sensitivities/specificities for HCC and non-HCC lesions were 92.7%/82.0% and 82.0%/92.7%, respectively. The area under the receiver operating characteristic curve was 0.912. The CNN's performance was correlated with the lesion grading system, becoming less accurate the more atypical imaging features the lesions showed.
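The relationship between these figures can be checked with a short calculation. In a two-class problem, the sensitivity for one class is the specificity for the other, and with class-balanced test sets the accuracy is the mean of the two. The sketch below uses hypothetical pooled counts that are back-calculated from the reported percentages (150 runs × 5 lesions = 750 test lesions per class). The counts themselves are not reported in the abstract and are an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical pooled counts, back-calculated from the reported percentages
# (assumption: 750 HCC and 750 non-HCC test lesions across all runs).
hcc_correct, hcc_missed = 695, 55            # HCC lesions classified as HCC / non-HCC
non_hcc_correct, non_hcc_missed = 615, 135   # non-HCC classified as non-HCC / HCC

# Treating HCC as the positive class
sens_hcc, spec_hcc, acc = binary_metrics(
    hcc_correct, hcc_missed, non_hcc_correct, non_hcc_missed
)
# Treating non-HCC as the positive class simply swaps the roles
sens_non, spec_non, _ = binary_metrics(
    non_hcc_correct, non_hcc_missed, hcc_correct, hcc_missed
)

print(f"{sens_hcc:.1%} {spec_hcc:.1%} {acc:.1%}")  # 92.7% 82.0% 87.3%
print(f"{sens_non:.1%} {spec_non:.1%}")            # 82.0% 92.7%
```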
This study provides proof-of-concept for CNN-based classification of both typical- and atypical-appearing HCC lesions on multi-phasic MRI, utilizing pathologically confirmed lesions as "ground truth."
• A CNN trained on atypical-appearing, pathologically proven HCC lesions not meeting LI-RADS criteria for definitive HCC (LR5) can correctly differentiate HCC lesions from other liver malignancies, potentially expanding the role of image-based diagnosis in primary liver cancer with atypical features.
• The trained CNN demonstrated an overall accuracy of 87.3% and a computational time of < 3 ms, which paves the way for clinical application as a decision support instrument.