Zhong Jingyu, Liu Xianwei, Lu Junjie, Yang Jiarui, Zhang Guangcheng, Mao Shiqi, Chen Haoda, Yin Qian, Cen Qingqing, Jiang Run, Song Yang, Lu Minda, Chu Jingshen, Xing Yue, Hu Yangfan, Ding Defang, Ge Xiang, Zhang Huan, Yao Weiwu
Laboratory of Key Technology and Materials in Minimally Invasive Spine Surgery, Tongren Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
Center for Spinal Minimally Invasive Research, Shanghai Jiao Tong University, Shanghai, China.
Eur Radiol. 2025 Mar;35(3):1146-1156. doi: 10.1007/s00330-024-11331-0. Epub 2025 Jan 9.
OBJECTIVES: To investigate how studies determine the sample size when developing radiomics prediction models for binary outcomes, and whether the sample sizes used meet the estimates obtained from established criteria.
METHODS: We identified radiomics studies published from 01 January 2023 to 31 December 2023 in seven leading peer-reviewed radiological journals. We reviewed the sample size justification methods and the actual sample sizes used. We compared each actual sample size with the estimates obtained from three established criteria proposed by Riley et al. We then investigated which study characteristics were associated with a sufficient sample size, defined as one that meets the estimates obtained from those criteria.
RESULTS: We included 116 studies. Eleven of the 116 studies justified their sample size, and six of those 11 performed an a priori sample size calculation. The median (first and third quartile, Q1, Q3) total sample size was 223 (130, 463), and the median training sample size was 150 (90, 288). The median (Q1, Q3) difference between the total sample size and the minimum sample size according to the established criteria was -100 (-216, 183), and the median difference between the total sample size and the estimate from a more restrictive approach based on the established criteria was -268 (-427, -157). The presence of external testing and the specialty of the topic were associated with sufficient sample size.
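The abstract does not spell out the three criteria, so the following is only a minimal sketch of how they are commonly written for binary outcomes, as recalled from the guidance published by Riley et al.: (1) a global shrinkage factor of at least 0.9, (2) a difference of at most 0.05 between apparent and adjusted Nagelkerke R², and (3) estimation of the overall outcome risk within a margin of 0.05. The function names, default thresholds, and input values below are illustrative assumptions, not taken from the study, and the formulas should be checked against the original guidance before use.

```python
import math


def max_r2_cs(prevalence):
    """Maximum possible Cox-Snell R^2 for a given outcome prevalence."""
    ln_l_null = (prevalence * math.log(prevalence)
                 + (1 - prevalence) * math.log(1 - prevalence))
    return 1 - math.exp(2 * ln_l_null)


def n_for_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Criterion 1: sample size giving a global shrinkage factor >= `shrinkage`."""
    return math.ceil(n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


def n_for_optimism(n_params, r2_cs, prevalence, delta=0.05):
    """Criterion 2: apparent minus adjusted Nagelkerke R^2 of at most `delta`."""
    implied_shrinkage = r2_cs / (r2_cs + delta * max_r2_cs(prevalence))
    return n_for_shrinkage(n_params, r2_cs, implied_shrinkage)


def n_for_overall_risk(prevalence, margin=0.05):
    """Criterion 3: estimate the overall outcome risk to within +/- `margin`."""
    return math.ceil((1.96 / margin) ** 2 * prevalence * (1 - prevalence))


def minimum_sample_size(n_params, r2_cs, prevalence):
    """Smallest n that satisfies all three criteria at once."""
    return max(
        n_for_shrinkage(n_params, r2_cs),
        n_for_optimism(n_params, r2_cs, prevalence),
        n_for_overall_risk(prevalence),
    )


def sample_size_difference(actual_n, n_params, r2_cs, prevalence):
    """Actual minus required sample size; negative values indicate a shortfall."""
    return actual_n - minimum_sample_size(n_params, r2_cs, prevalence)


if __name__ == "__main__":
    # Hypothetical inputs: 24 candidate predictor parameters, an anticipated
    # Cox-Snell R^2 of 0.288, and an outcome prevalence of 0.174.
    required = minimum_sample_size(n_params=24, r2_cs=0.288, prevalence=0.174)
    print("required:", required)
    print("difference for n=223:", sample_size_difference(223, 24, 0.288, 0.174))
```

A negative value from `sample_size_difference` corresponds to the kind of shortfall summarized by the median differences reported in the results. The required size grows with the number of candidate predictor parameters, which is why models that screen many radiomics features tend to need larger samples.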
CONCLUSION: Radiomics studies are often designed without sample size justification, and their sample sizes may be too small to avoid overfitting. Sample size justification is encouraged when developing a radiomics model.
KEY POINTS:
Question: Sample size justification is critical for minimizing overfitting when developing a radiomics model, but it is often overlooked in radiomics research, leaving studies underpowered.
Findings: Few radiomics studies justified, calculated, or reported their sample size, and most did not meet recent formal sample size criteria.
Clinical relevance: Radiomics models are often designed without sample size justification. Consequently, many are developed on samples too small to avoid overfitting. Researchers should be encouraged to justify, calculate, and report sample size considerations when developing radiomics models.