Younger Jarrod, Morris Emily, Arnold Nicholas, Athulathmudali Chanchala, Pinidiyapathirage Janani, MacAskill William
Griffith University, Gold Coast, QLD, Australia.
Metro South Health Hospital and Health Service, Brisbane, QLD, Australia.
Jpn J Radiol. 2025 Aug 18. doi: 10.1007/s11604-025-01853-y.
This systematic review examines the literature on artificial intelligence (AI) algorithms for diagnosing hepatocellular carcinoma (HCC) among focal liver lesions on multiphase CT images, compared with radiologists, focusing on studies that report at least sensitivity and specificity.
We searched Embase, PubMed and Web of Science for studies published from January 2018 to May 2024. Eligible studies evaluated AI algorithms for diagnosing HCC using multiphase CT, with radiologist interpretation as a comparator. The sensitivity and specificity of AI models and radiologists were extracted from each study. TRIPOD+AI was used for quality appraisal and PROBAST was used to assess the risk of bias.
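For readers less familiar with the two metrics extracted above, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not data from any included study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: HCC lesions correctly called HCC
    fn: HCC lesions missed
    tn: non-HCC lesions correctly called non-HCC
    fp: non-HCC lesions wrongly called HCC
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the review).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=170, fp=30)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
# prints "sensitivity=90.0%, specificity=85.0%"
```

Because each metric depends on only one class of lesion, a reader can trade one against the other by shifting the decision threshold, which is why both are reported together.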
Of the 3532 records reviewed, seven studies were included. All seven analysed the performance of both AI models and radiologists. Two studies additionally assessed AI model performance with and without supplementary clinical information. Three studies additionally evaluated the performance of radiologists when assisted by the AI algorithm. The AI algorithms demonstrated a sensitivity of 63.0-98.6% and a specificity of 82.0-98.6%. In comparison, junior radiologists (less than 10 years of experience) exhibited a sensitivity of 41.2-92.0% and a specificity of 72.2-100%, while senior radiologists (more than 10 years of experience) achieved a sensitivity of 63.9-93.7% and a specificity of 71.9-99.9%.
AI algorithms demonstrate adequate performance in diagnosing HCC among focal liver lesions on multiphase CT images. Across geographic settings, AI could help streamline workflows and improve access to timely diagnosis. However, thoughtful implementation strategies are still needed to mitigate bias and overreliance.