Noguchi Tomoyuki, Matsushita Yumi, Kawata Yusuke, Shida Yoshitaka, Machitori Akihiro
Education and Training Office, Department of Clinical Research, Centre for Clinical Sciences, Japan.
Department of Radiology, National Hospital Organization Kyushu Medical Centre, Jigyohama, Chuo-ku, Fukuoka City, Fukuoka Province, Japan.
Pol J Radiol. 2021 Sep 13;86:e532-e541. doi: 10.5114/pjr.2021.110309. eCollection 2021.
The increasing use of deep learning (DL) in medical imaging diagnosis has led to more frequent use of 10-fold cross-validation (10-CV) to evaluate DL performance. To eliminate some of the repetitive (10-fold) processing in 10-CV, we proposed a "generalized fitting method in conjunction with every possible coalition of N-combinations (G-EPOC)" to estimate the range of the mean accuracy of 10-CV using fewer than 10 of the 10-CV results.
G-EPOC was executed as follows. We first formed (2^N - 1) coalition subsets from a specified number N (9 or fewer) of the 10 result datasets of 10-CV. We then obtained the estimation range of the accuracy by applying distribution fitting to those subsets twice, using a combination of normal, binomial, or Poisson distributions. Using 10-CV datasets acquired from a practical DL task of detecting appendicitis on CT, we computed the estimation success rate, counting an estimate as successful when the range provided by G-EPOC included the true accuracy.
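The coalition step can be illustrated with a minimal Python sketch. It assumes that a "coalition subset" is any non-empty subset of the N available fold results, which is what gives 2^N - 1 subsets. The function names, the sample accuracies, and the single normal-approximation range are hypothetical stand-ins for illustration; they do not reproduce the two-stage distribution fitting used by G-EPOC.

```python
from itertools import combinations
from statistics import mean, stdev


def coalition_subsets(results):
    """Return every non-empty subset (coalition) of the given fold results."""
    subsets = []
    for size in range(1, len(results) + 1):
        subsets.extend(combinations(results, size))
    return subsets


def naive_range(results, z=1.96):
    """Illustrative only: a normal-approximation range over coalition means.

    This is a simplification, not the G-EPOC fitting procedure.
    Requires at least two results so that the spread is defined.
    """
    means = [mean(s) for s in coalition_subsets(results)]
    centre = mean(means)
    spread = stdev(means)
    return centre - z * spread, centre + z * spread


# Hypothetical accuracies from N = 4 of the 10 folds (made-up values).
fold_accuracies = [0.82, 0.85, 0.79, 0.88]
subsets = coalition_subsets(fold_accuracies)
low, high = naive_range(fold_accuracies)
print(len(subsets), round(low, 3), round(high, 3))
```

With N = 4 the sketch enumerates 2^4 - 1 = 15 coalitions. The range it prints is only a placeholder for the estimation range that G-EPOC derives from its own fitting steps.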
G-EPOC successfully estimated the range of the mean accuracy of 10-CV at rates of over 95% for datasets with N set from 2 to 9.
G-EPOC will help reduce the time and computing resources consumed in developing computer-based diagnosis in medical imaging, and could become an option for selecting a reasonable K value in K-CV.