Sushentsev Nikita, Moreira Da Silva Nadia, Yeung Michael, Barrett Tristan, Sala Evis, Roberts Michael, Rundo Leonardo
Department of Radiology, University of Cambridge School of Clinical Medicine, Addenbrooke's Hospital and University of Cambridge, Cambridge Biomedical Campus, Box 218, Cambridge, CB2 0QQ, UK.
Lucida Medical Ltd, Biomedical Innovation Hub, University of Cambridge, Cambridge, UK.
Insights Imaging. 2022 Mar 28;13(1):59. doi: 10.1186/s13244-022-01199-3.
We systematically reviewed the current literature evaluating the ability of fully-automated deep learning (DL) and semi-automated traditional machine learning (TML) MRI-based artificial intelligence (AI) methods to differentiate clinically significant prostate cancer (csPCa) from indolent PCa (iPCa) and benign conditions.
We performed a computerised bibliographic search of studies indexed in MEDLINE/PubMed, arXiv, medRxiv, and bioRxiv between 1 January 2016 and 31 July 2021. Two reviewers performed the title/abstract and full-text screening. The remaining papers were screened by four reviewers using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) for DL studies and Radiomics Quality Score (RQS) for TML studies. Papers that fulfilled the pre-defined screening requirements underwent full CLAIM/RQS evaluation alongside the risk of bias assessment using QUADAS-2, both conducted by the same four reviewers. Standard measures of discrimination were extracted for the developed predictive models.
Of 28 papers, 17 (five DL and twelve TML) passed the quality screening and underwent full CLAIM/RQS/QUADAS-2 assessment, which revealed substantial study heterogeneity that precluded quantitative analysis as part of this review. The mean RQS of TML papers was 11/36, and five papers had a high risk of bias. Areas under the curve (AUCs) of DL and TML papers with low risk of bias ranged from 0.80 to 0.89 and from 0.75 to 0.88, respectively.
We observed comparable performance of the two classes of AI methods and identified several common methodological limitations and biases that future studies will need to address to ensure the generalisability of the developed models.