Subramanian Harry, Dey Rahul, Brim Waverly Rose, Tillmanns Niklas, Cassinelli Petersen Gabriel, Brackett Alexandria, Mahajan Amit, Johnson Michele, Malhotra Ajay, Aboian Mariam
Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States.
Harvey Cushing/John Hay Whitney Medical Library, Yale School of Medicine, New Haven, CT, United States.
Front Oncol. 2021 Dec 23;11:788819. doi: 10.3389/fonc.2021.788819. eCollection 2021.
Machine learning has been applied to the diagnostic imaging of gliomas to augment classification, prognostication, segmentation, and treatment planning. A systematic literature review was performed to determine how machine learning has been applied to identify gliomas in datasets that include non-glioma images, thereby simulating normal clinical practice.
A medical librarian searched four databases for all articles published prior to February 1, 2021, and a second librarian confirmed the search: Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL), and Web of Science-Core Collection. The search strategy included both keywords and controlled vocabulary, combining the terms artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, and glioma, as well as related terms. The review was conducted in a stepwise fashion with abstract screening, full-text screening, and data extraction. Quality of reporting was assessed using the TRIPOD criteria.
A total of 11,727 candidate articles were identified, of which 12 articles were included in the final analysis. Studies investigated the differentiation of normal from abnormal images in datasets that include gliomas (7 articles) and the differentiation of glioma images from non-glioma or normal images (5 articles). Single-institution datasets were most common (5 articles), followed by BRATS (3 articles). The median sample size was 280 patients. Algorithm testing strategies consisted of five-fold cross-validation (5 articles) and the use of exclusive sets of images within the same dataset for training and for testing (7 articles). Neural networks were the most common type of algorithm (10 articles). The accuracy of algorithms ranged from 0.75 to 1.00 (median 0.96, 10 articles). Quality of reporting assessment using the TRIPOD criteria yielded a mean individual TRIPOD ratio of 0.50 (standard deviation 0.14, range 0.37 to 0.85).
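The two testing strategies named above, five-fold cross-validation and a single exclusive train/test split within one dataset, can be sketched as follows. This is only an illustration under stated assumptions, not the method of any reviewed study: the function names `k_fold_indices` and `holdout_indices`, the 20% test fraction, and the fixed random seed are hypothetical, and the sample count of 280 is borrowed from the reported median sample size purely as an example. Each index is assumed to stand for one patient, so that no patient's images fall on both sides of a split.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Every sample appears in exactly one test fold. Names and defaults
    are hypothetical, chosen for illustration only.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def holdout_indices(n_samples, test_fraction=0.2, seed=0):
    """Return one exclusive (train, test) split of the same dataset.

    The 20% test fraction is an assumption, not a value from the review.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


if __name__ == "__main__":
    n_patients = 280  # illustrative: the median sample size reported above
    for fold, (train, test) in enumerate(k_fold_indices(n_patients), start=1):
        print(f"fold {fold}: {len(train)} train, {len(test)} test")
    train, test = holdout_indices(n_patients)
    print(f"hold-out: {len(train)} train, {len(test)} test")
```

Both strategies draw training and test cases from the same dataset, which is why neither one measures performance on an external dataset.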
This systematic review of studies identifying gliomas in datasets that include non-glioma images revealed multiple limitations hindering the application of these algorithms to clinical practice. These included limited datasets, a lack of generalizable algorithm training and testing strategies, and poor quality of reporting. More robust and heterogeneous datasets are needed for algorithm development. Future studies would benefit from using external datasets for algorithm testing and from paying greater attention to quality of reporting standards.
www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42020209938, International Prospective Register of Systematic Reviews (PROSPERO 2020 CRD42020209938).