Department of Nuclear Medicine, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.
Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin.
Nuklearmedizin. 2023 Dec;62(6):361-369. doi: 10.1055/a-2198-0545. Epub 2023 Nov 23.
Despite a vast number of articles on radiomics and machine learning in positron emission tomography (PET) imaging, clinical applicability remains limited, partly owing to poor methodological quality. We therefore systematically investigated the methodology described in publications on radiomics and machine learning for PET-based outcome prediction.
A systematic search for original articles was run on PubMed. All articles were rated according to 17 criteria proposed by the authors. Criteria with >2 rating categories were binarized into "adequate" or "inadequate". The association between the number of "adequate" criteria per article and the date of publication was examined.
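The rating and trend analysis described above can be illustrated with a minimal sketch. It assumes a hypothetical three-level rating scale (0, 1, 2) in which only the top level counts as "adequate", and it uses toy data. The function names, the threshold, and the example values are illustrative assumptions, not the authors' actual scale or code. Spearman's rho is computed here as the Pearson correlation of tie-averaged ranks. The p-value is not computed.

```python
from statistics import mean


def binarize(rating, adequate_from=2):
    """Map an ordinal rating to True ("adequate") or False ("inadequate").

    The 0/1/2 scale and the threshold are assumptions for illustration.
    """
    return rating >= adequate_from


def adequate_count(ratings):
    """Number of criteria rated "adequate" for one article."""
    return sum(binarize(r) for r in ratings)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy data (hypothetical): months since the first publication, and ratings
# on four criteria per article.
months = [0, 12, 30, 48, 70]
ratings_per_article = [
    [2, 1, 0, 2],
    [2, 2, 1, 0],
    [1, 0, 0, 2],
    [2, 2, 2, 1],
    [0, 2, 1, 1],
]
counts = [adequate_count(r) for r in ratings_per_article]
rho = spearman_rho(months, counts)
print(f"adequate counts: {counts}, Spearman's rho: {rho:.3f}")
```

A rho near zero, as reported in the results, would indicate no monotonic trend in the number of "adequate" criteria over time. In practice a statistics library would normally be used, since it also provides the p-value.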
One hundred articles were identified (published between 07/2017 and 09/2023). The median proportion of articles per criterion that were rated "adequate" was 65% (range: 23-98%). Nineteen articles (19%) mentioned neither a test cohort nor cross-validation to separate training from testing. The median number of criteria rated "adequate" per article was 12.5 out of 17 (range: 4-17), and this number did not increase with later publication dates (Spearman's rho: 0.094; p = 0.35). In 22 articles (22%), fewer than half of the criteria were rated "adequate". Only 8% of articles published their source code, and 10% made their dataset openly available.
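To make concrete what separating training from testing by cross-validation means, the following is a minimal sketch of k-fold index splitting. The function name `k_fold_splits` and its parameters are hypothetical and chosen for illustration. It is not taken from any of the reviewed articles, and it omits refinements such as stratification by outcome.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Every sample lands in exactly one test fold, and the training and test
    indices of a given split never overlap.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


for train_idx, test_idx in k_fold_splits(10, k=5):
    # A model would be fitted on train_idx and evaluated only on test_idx.
    print(sorted(train_idx), sorted(test_idx))
```

Feature selection and any other data-dependent preprocessing would also need to be fitted inside each training fold, otherwise information from the test fold leaks into the model.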
Among the articles investigated, we identified methodological weaknesses, and compliance with recommendations on methodological quality and reporting leaves room for improvement. Better adherence to established guidelines could increase the clinical significance of radiomics and machine learning for PET-based outcome prediction and ultimately lead to their widespread use in routine clinical practice.