
Investigation on potential bias factors in histopathology datasets.

Authors

Kheiri Farnaz, Rahnamayan Shahryar, Makrehchi Masoud, Asilian Bidgoli Azam

Affiliations

Department of Electrical, Computer and Software Engineering, Ontario Tech University, Oshawa, Canada.

Department of Engineering, Brock University, St. Catharines, Canada.

Publication information

Sci Rep. 2025 Apr 2;15(1):11349. doi: 10.1038/s41598-025-89210-x.

DOI: 10.1038/s41598-025-89210-x
PMID: 40175463
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11965531/
Abstract

Deep neural networks (DNNs) have demonstrated remarkable capabilities in medical applications, including digital pathology, where they excel at analyzing complex patterns in medical images to assist in accurate disease diagnosis and prognosis. However, concerns have arisen about potential biases in The Cancer Genome Atlas (TCGA) dataset, a comprehensive repository of digitized histopathology data that serves as both a training and validation source for deep learning models. These concerns suggest that over-optimistic model performance may stem from reliance on biased features rather than histological characteristics. Surprisingly, recent studies have confirmed the existence of site-specific bias in the embedded features extracted for cancer-type discrimination, which leads to high accuracy in acquisition site classification. This behavior motivated us to conduct an in-depth analysis of the potential causes behind this unexpected ability to recognize site-specific patterns. The analysis was conducted on two cutting-edge DNN models: KimiaNet, a state-of-the-art DNN trained on TCGA images, and a self-trained EfficientNet. In this study, the balanced accuracy metric is used to evaluate a model that was originally designed to learn cancerous patterns and is then trained to classify data centers, with the aim of investigating the potential factors contributing to the higher balanced accuracy in data center detection.
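The abstract relies on balanced accuracy without defining it. As a minimal sketch, assuming the standard definition (the unweighted mean of per-class recall), the metric can be computed as below. The function name, the site labels, and the tile counts are hypothetical and are not taken from the paper; they only illustrate why balanced accuracy is preferred over plain accuracy when acquisition sites contribute unequal numbers of images.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced acquisition-site labels: 8 tiles from one site, 2 from another.
true_sites = ["site_A"] * 8 + ["site_B"] * 2
# A degenerate classifier that always predicts the majority site.
pred_sites = ["site_A"] * 10

plain_accuracy = sum(t == p for t, p in zip(true_sites, pred_sites)) / len(true_sites)
score = balanced_accuracy(true_sites, pred_sites)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {score:.2f}")
```

In this toy case the majority-only classifier reaches a plain accuracy of 0.80 but a balanced accuracy of 0.50, which is chance level for two classes. Under this definition, a high balanced accuracy in data center detection cannot be explained by class imbalance alone.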

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/fd3846167d1e/41598_2025_89210_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/c2e57171fe1a/41598_2025_89210_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/cea144065506/41598_2025_89210_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/5ecc10587195/41598_2025_89210_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/f3abfbc3c42a/41598_2025_89210_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/3014eaf74c73/41598_2025_89210_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07d0/11965531/7c73d2ce61b4/41598_2025_89210_Fig7_HTML.jpg
