Qin Xiang, Huang Lisheng, Wei Yuanfeng, Li Hongxiang, Wu Yuting, Zhong Jingmeng, Jian Mingjue, Zhang Jing, Zheng Zeyu, Xu Yikai, Yan Chenggong
Department of Medical Imaging Center, Nanfang Hospital, Southern Medical University, Guangzhou North Avenue No.1838, 510515, Guangzhou, China.
Abdom Radiol (NY). 2025 Sep 16. doi: 10.1007/s00261-025-05202-5.
The Liver Imaging Reporting and Data System (LI-RADS) assessment is subject to inter-reader variability. The present study aimed to evaluate the impact of an artificial intelligence (AI) system on the accuracy and inter-reader agreement of LI-RADS classification based on contrast-enhanced magnetic resonance imaging among radiologists with varying experience levels.
This single-center, multi-reader, multi-case retrospective study included 120 patients with 200 focal liver lesions who underwent abdominal contrast-enhanced magnetic resonance imaging between June 2023 and May 2024. Five radiologists with different experience levels independently assessed LI-RADS classification and imaging features with and without AI assistance. The reference standard was established by consensus between two expert radiologists. Accuracy was used to measure the performance of the AI system and the radiologists. Inter-reader agreement was estimated with the kappa statistic or the intraclass correlation coefficient.
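As an illustration of the agreement statistic named above, the following is a minimal sketch of a weighted Cohen's kappa for two readers. It is not the study's analysis code: the abstract does not state the weighting scheme (linear is assumed here), how agreement across the five readers was combined, or how the non-ordinal LR-M category was handled, so the sketch is limited to LR-3/4/5. The function name `weighted_kappa` and the toy ratings are hypothetical.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two readers over ordered categories.

    `categories` lists the labels in ordinal order. `weights` is
    "linear" or "quadratic" (an assumption; the abstract does not
    say which scheme the study used).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    index = {label: i for i, label in enumerate(categories)}
    n = len(ratings_a)

    # Joint proportion of lesions rated (i, j) by reader A and reader B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each reader.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if expected_dis == 0:
        return float("nan")  # undefined when chance disagreement is zero
    return 1.0 - observed_dis / expected_dis


# Toy ratings for six lesions (made-up values, not study data).
categories = ["LR-3", "LR-4", "LR-5"]
reader_a = ["LR-3", "LR-4", "LR-5", "LR-3", "LR-4", "LR-5"]
reader_b = ["LR-3", "LR-4", "LR-5", "LR-4", "LR-4", "LR-3"]
kappa = weighted_kappa(reader_a, reader_b, categories)
```

With linear weights, a one-category disagreement (LR-3 vs. LR-4) is penalized half as much as a two-category disagreement (LR-3 vs. LR-5), which is why a weighted statistic suits an ordinal scale better than the unweighted kappa.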
The LI-RADS categories were distributed as follows: LR-3, 33.5% (67/200); LR-4, 29.0% (58/200); LR-5, 33.5% (67/200); and LR-M, 4.0% (8/200). The AI system significantly improved the overall accuracy of LI-RADS classification from 69.9% to 80.1% (p < 0.001), with the most notable improvement among junior radiologists, from 65.7% to 79.7% (p < 0.001). Inter-reader agreement for LI-RADS classification was significantly higher with AI assistance than without (weighted Cohen's kappa, 0.812 vs. 0.655, p < 0.001). The AI system also enhanced the accuracy and inter-reader agreement for imaging features, including non-rim arterial phase hyperenhancement, non-peripheral washout, and restricted diffusion. Additionally, inter-reader agreement for lesion size measurements improved, with the intraclass correlation coefficient rising from 0.857 to 0.951 (p < 0.001).
The AI system significantly increases accuracy and inter-reader agreement of LI-RADS 3/4/5/M classification, particularly benefiting junior radiologists.