Physical imaging parameter variation drives domain shift.

Affiliations

Department of Physics of Complex Systems, Eötvös Loránd University, Budapest, Hungary.

Health Services Management Training Centre, Semmelweis University, Budapest, Hungary.

Publication information

Sci Rep. 2022 Dec 9;12(1):21302. doi: 10.1038/s41598-022-23990-4.

Abstract

Statistical learning algorithms rely strongly on an oversimplified assumption for optimal performance: that source (training) and target (testing) data are independent and identically distributed. Variation in human tissue, physician labeling and physical imaging parameters (PIPs) in the generative process yields medical image datasets with statistics that render this central assumption false. When models are deployed, new examples are often out of distribution with respect to the training data. Training robust, dependable and predictive models therefore remains a challenge in medical imaging, and significant accuracy drops are common for deployed models. This statistical variation between training and testing data is referred to as domain shift (DS). To the best of our knowledge, we provide the first empirical evidence that variation in PIPs between test and train medical image datasets is a significant driver of DS, and that model generalization error is correlated with this variance. We show that significant covariate shift occurs due to a selection bias in sampling from a small area of PIP space, in both inter- and intra-hospital regimes. To show this, we control for population shift, prevalence shift, data selection biases and annotation biases in order to isolate the effect of the physical generation process on model generalization, using a proxy task of age group estimation on a combined 44k-image mammogram dataset collected from five hospitals. We hypothesize that training data should be sampled evenly from PIP space to produce the most robust models, and we hope this study provides motivation to retain medical image generation metadata, which is almost always discarded or redacted in open source datasets. This metadata, measured in standard international units, can provide a universal regularizing anchor between distributions generated across the world for all current and future imaging modalities.
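The hypothesis that training data should be sampled evenly from PIP space can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single hypothetical PIP (a tube voltage field named `kvp`), equal-width bins, and a toy record format, all chosen here for illustration only. The function draws the same number of records from each bin, so a densely sampled region of PIP space no longer dominates the training set.

```python
import random
from collections import defaultdict


def sample_evenly_by_pip(records, pip_key, n_bins, per_bin, seed=0):
    """Draw up to `per_bin` records from each equal-width bin of one PIP.

    `records` is a list of dicts; `pip_key` names the (hypothetical)
    metadata field holding the physical imaging parameter.
    """
    values = [r[pip_key] for r in records]
    lo, hi = min(values), max(values)
    # Fall back to a unit width when every record has the same value.
    width = (hi - lo) / n_bins or 1.0

    bins = defaultdict(list)
    for r in records:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((r[pip_key] - lo) / width), n_bins - 1)
        bins[idx].append(r)

    rng = random.Random(seed)
    sample = []
    for idx in sorted(bins):
        members = bins[idx]
        # Sparse bins contribute everything they have.
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


# Toy data (made up): 90 images clustered in a narrow kVp range,
# plus 10 images spread over the rest of the range.
records = [{"id": i, "kvp": 28.0 + (i % 10) * 0.1} for i in range(90)]
records += [{"id": 90 + i, "kvp": 30.0 + i * 0.5} for i in range(10)]

balanced = sample_evenly_by_pip(records, "kvp", n_bins=3, per_bin=5)
```

In the toy data most records fall in the lowest bin, and the balanced sample caps that bin at `per_bin` while keeping the sparsely populated bins. A real PIP space is multidimensional, so a practical version would need joint binning or another density-aware scheme; that choice is left open here.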
