Bias Analysis on Public X-Ray Image Datasets of Pneumonia and COVID-19 Patients.

Authors

Omar Del Tejo Catala, Ismael Salvador Igual, Francisco Javier Perez-Benito, David Millan Escriva, Vicent Ortiz Castello, Rafael Llobet, Juan-Carlos Perez-Cortes

Affiliations

Instituto Tecnológico de Informática (ITI), Universitat Politècnica de València, 46022 Valencia, Spain.

Department of Computer Systems and Computation (DSIC), Universitat Politècnica de València, 46022 Valencia, Spain.

Publication information

IEEE Access. 2021 Mar 10;9:42370-42383. doi: 10.1109/ACCESS.2021.3065456. eCollection 2021.

DOI: 10.1109/ACCESS.2021.3065456
PMID: 34812384
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8545228/
Abstract

Chest X-ray images are useful for early COVID-19 diagnosis with the advantage that X-ray devices are already available in health centers and images are obtained immediately. Some datasets containing X-ray images with cases (pneumonia or COVID-19) and controls have been made available to develop machine-learning-based methods to aid in diagnosing the disease. However, these datasets are mainly composed of different sources coming from pre-COVID-19 datasets and COVID-19 datasets. Particularly, we have detected a significant bias in some of the released datasets used to train and test diagnostic systems, which might imply that the results published are optimistic and may overestimate the actual predictive capacity of the techniques proposed. In this article, we analyze the existing bias in some commonly used datasets and propose a series of preliminary steps to carry out before the classic machine learning pipeline in order to detect possible biases, to avoid them if possible and to report results that are more representative of the actual predictive power of the methods under analysis.
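The abstract does not list the proposed preliminary steps, so the following is only an illustrative sketch of one possible check, not the authors' procedure. It assumes each image carries a metadata record of the form (source dataset, class label), and asks how accurately the label can be guessed from the source alone. The function names, the toy records, and the dataset names `pre_covid_set` and `covid_set` are all hypothetical.

```python
from collections import Counter, defaultdict


def source_only_accuracy(records):
    """Accuracy of guessing each label as the majority label of its source.

    `records` is a list of (source, label) pairs. A value close to 1.0
    suggests that class and source are strongly confounded.
    """
    if not records:
        raise ValueError("records must not be empty")
    by_source = defaultdict(Counter)
    for source, label in records:
        by_source[source][label] += 1
    # For each source, count the images that the majority label gets right.
    correct = sum(counts.most_common(1)[0][1] for counts in by_source.values())
    return correct / len(records)


def majority_baseline(records):
    """Accuracy of always guessing the single most frequent label."""
    if not records:
        raise ValueError("records must not be empty")
    labels = Counter(label for _, label in records)
    return labels.most_common(1)[0][1] / len(records)


# Hypothetical toy metadata: two sources with very different class mixes.
records = (
    [("pre_covid_set", "control")] * 90
    + [("pre_covid_set", "pneumonia")] * 10
    + [("covid_set", "covid")] * 95
    + [("covid_set", "control")] * 5
)

print("source-only accuracy:", source_only_accuracy(records))
print("majority baseline:", majority_baseline(records))
```

A large gap between the two numbers would indicate that a classifier could score well by recognizing where an image came from rather than what it shows, which is the kind of optimistic result the abstract warns about. This metadata check says nothing about image content itself; it is only a cheap first screen under the stated assumptions.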


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/66443d3ca012/salva1ab-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/6e5c1117f00f/salva2-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/265cd116f9be/salva3-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/c69013ed0d52/salva4-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/b50b69175f99/salva5-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/7d3a3fef4352/salva6ab-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/7a8b25f09f3a/salva7ab-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/67e3310e41fe/salva8-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/86dec039115c/salva9abc-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/bee1deb32573/salva10-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/9af5df27cd27/salva11ab-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/d32981cb13c9/salva12ab-3065456.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbee/8545228/9767276cac93/salva13ab-3065456.jpg
