Measuring the bias of incorrect application of feature selection when using cross-validation in radiomics.

Author information

Demircioğlu Aydin

Affiliation

Institute of Diagnostic and Interventional Radiology and Neuroradiology, University Hospital Essen, Hufelandstr. 55, 45147, Essen, Germany.

Publication information

Insights Imaging. 2021 Nov 24;12(1):172. doi: 10.1186/s13244-021-01115-1.

Abstract

BACKGROUND

Many radiomics studies use feature selection methods to identify the most predictive features while employing cross-validation to estimate the performance of the developed models. However, if feature selection is performed before cross-validation, data leakage can occur and the results can be biased. To measure the extent of this bias, we collected ten publicly available radiomics datasets and conducted two experiments. First, models were developed by incorrectly applying feature selection prior to cross-validation. Then, the same experiment was conducted by applying feature selection correctly within cross-validation, separately in each fold. The resulting models were then compared in terms of AUC-ROC, AUC-F1, and Accuracy.

RESULTS

Applying the feature selection incorrectly prior to the cross-validation showed a bias of up to 0.15 in AUC-ROC, 0.29 in AUC-F1, and 0.17 in Accuracy.

CONCLUSIONS

Incorrect application of feature selection and cross-validation can lead to highly biased results for radiomic datasets.
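
The difference between the two setups described in the background can be illustrated with a small, dependency-free sketch. It is not the study's code: the pure-noise synthetic data, the mean-difference filter, the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example short. Because the features carry no signal, the true accuracy is 0.5, so any estimate well above that reflects leakage from the selection step.

```python
import random


def make_noise_data(n_samples, n_features, rng):
    """Pure-noise features with balanced, shuffled labels (no real signal)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]
    rng.shuffle(y)
    return X, y


def class_means(X, y, rows, feats):
    """Per-feature mean of each class, computed on the given rows only."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]
    m_pos = [sum(X[i][j] for i in pos) / len(pos) for j in feats]
    m_neg = [sum(X[i][j] for i in neg) / len(neg) for j in feats]
    return m_pos, m_neg


def select_top_k(X, y, rows, k):
    """Simple filter: keep the k features with the largest class-mean gap."""
    feats = list(range(len(X[0])))
    m_pos, m_neg = class_means(X, y, rows, feats)
    ranked = sorted(feats, key=lambda j: abs(m_pos[j] - m_neg[j]), reverse=True)
    return ranked[:k]


def fold_accuracy(X, y, train, test, feats):
    """Nearest-centroid classifier fitted on train rows, scored on test rows."""
    c_pos, c_neg = class_means(X, y, train, feats)
    correct = 0
    for i in test:
        d_pos = sum((X[i][j] - c) ** 2 for j, c in zip(feats, c_pos))
        d_neg = sum((X[i][j] - c) ** 2 for j, c in zip(feats, c_neg))
        pred = 1 if d_pos < d_neg else 0
        correct += int(pred == y[i])
    return correct / len(test)


def cross_validate(X, y, k_feats, n_folds, leaky):
    """Mean CV accuracy; `leaky=True` selects features before splitting."""
    n = len(X)
    all_rows = list(range(n))
    folds = [list(range(f, n, n_folds)) for f in range(n_folds)]
    # Incorrect setup: selection sees every sample, including test folds.
    global_feats = select_top_k(X, y, all_rows, k_feats) if leaky else None
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in all_rows if i not in held_out]
        # Correct setup: selection is repeated on the training rows only.
        feats = global_feats if leaky else select_top_k(X, y, train, k_feats)
        scores.append(fold_accuracy(X, y, train, test, feats))
    return sum(scores) / len(scores)


rng = random.Random(0)
leaky_runs, correct_runs = [], []
for _ in range(5):  # average over a few synthetic datasets
    X, y = make_noise_data(n_samples=40, n_features=500, rng=rng)
    leaky_runs.append(cross_validate(X, y, k_feats=10, n_folds=5, leaky=True))
    correct_runs.append(cross_validate(X, y, k_feats=10, n_folds=5, leaky=False))

leaky_acc = sum(leaky_runs) / len(leaky_runs)
correct_acc = sum(correct_runs) / len(correct_runs)
print(f"selection before CV: {leaky_acc:.2f}")
print(f"selection inside CV: {correct_acc:.2f}")
```

The only difference between the two runs is where `select_top_k` is called. In scikit-learn, the same separation is typically obtained by placing the selector and the classifier in a `Pipeline` and passing that pipeline to `cross_val_score`, so that selection is refit on each training fold.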

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a1fe/8613324/f57c957629fe/13244_2021_1115_Fig1_HTML.jpg