Improving Generalization of Deep Learning Models for Diagnostic Pathology by Increasing Variability in Training Data: Experiments on Osteosarcoma Subtypes.

Authors

Tang Haiming, Sun Nanfei, Shen Steven

Affiliations

Department of Pathology and Laboratory Medicine, Yale New Haven Hospital, New Haven, Connecticut, USA.

Department of Management Information System, College of Business, University of Houston Clear Lake, Houston, Texas, USA.

Publication information

J Pathol Inform. 2021 Aug 4;12:30. doi: 10.4103/jpi.jpi_78_20. eCollection 2021.

DOI:10.4103/jpi.jpi_78_20
PMID:34497734
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8404558/
Abstract

BACKGROUND

Artificial intelligence is making rapid progress in diagnostic pathology. Many studies applying deep learning models to histopathological images have been published in recent years. While many of these studies claim high accuracy, they may suffer from overfitting and a lack of generalization because histopathological images are highly variable.

AIMS AND OBJECTIVES

To use the training of an osteosarcoma model as an example that illustrates the pitfalls of overfitting and shows how adding variability to the model input can improve model performance.

MATERIALS AND METHODS

We use the publicly available osteosarcoma dataset to retrain a previously published classification model for osteosarcoma. We partition the same set of images into training and testing datasets differently from the original study: the test dataset consists of images from one patient, while the training dataset consists of images from all other patients. We also show the influence of training data variability on model performance by collecting a minimal dataset of 10 osteosarcoma subtypes as well as benign tissues and benign bone tumors of differentiation.
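
The patient-level partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout `(patient_id, image_path, label)`, the function name `leave_one_patient_out`, and the toy file names and labels are all assumptions made for the example.

```python
def leave_one_patient_out(records, held_out_patient):
    """Split records so that one patient's images form the test set.

    `records` is an iterable of (patient_id, image_path, label) tuples
    (a hypothetical layout). All images from `held_out_patient` go to the
    test set; images from every other patient go to the training set.
    """
    train, test = [], []
    for patient_id, image_path, label in records:
        target = test if patient_id == held_out_patient else train
        target.append((image_path, label))
    if not test:
        raise ValueError(f"no images found for patient {held_out_patient!r}")
    return train, test


# Hypothetical toy records; real ones would come from the dataset's metadata.
records = [
    ("P1", "p1_tile_001.jpg", "tumor"),
    ("P1", "p1_tile_002.jpg", "non_tumor"),
    ("P2", "p2_tile_001.jpg", "non_tumor"),
    ("P3", "p3_tile_001.jpg", "tumor"),
]

train, test = leave_one_patient_out(records, held_out_patient="P1")
print(len(train), "training images;", len(test), "test images")
```

Splitting by patient rather than by image keeps tiles from the same patient out of both sets at once, which is the situation the abstract associates with overly optimistic accuracy.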

RESULTS

The performance of the retrained model on the test set under the new partition schema declines dramatically, indicating overfitting and a lack of model generalization. We show that adding more subtypes to the training data step by step, under the same model schema, yields a series of coherent models with increasing performance.
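
One way to picture the step-by-step addition of subtypes is to build a growing sequence of training subsets, one per added subtype. The sketch below is an assumption-laden illustration rather than the study's procedure: the `(image_path, subtype)` sample layout, the function name `cumulative_subtype_sets`, and the placeholder subtype names are hypothetical, and model training itself is left out.

```python
def cumulative_subtype_sets(samples, subtype_order):
    """Yield (included_subtypes, training_subset) pairs.

    `samples` is a list of (image_path, subtype) tuples (hypothetical layout).
    Each step adds one more subtype from `subtype_order` to the training data,
    so every subset contains all of the previous one.
    """
    included = []
    for subtype in subtype_order:
        included.append(subtype)
        subset = [s for s in samples if s[1] in included]
        yield tuple(included), subset


# Placeholder subtype names and file names, for illustration only.
samples = [
    ("a1.jpg", "subtype_A"),
    ("b1.jpg", "subtype_B"),
    ("b2.jpg", "subtype_B"),
    ("c1.jpg", "subtype_C"),
]

steps = list(cumulative_subtype_sets(samples, ["subtype_A", "subtype_B", "subtype_C"]))
for included, subset in steps:
    # A real experiment would train and evaluate the same model schema here.
    print(len(included), "subtypes,", len(subset), "training images")
```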

CONCLUSIONS

We propose data preprocessing and collection tactics for highly variable histopathological images that help avoid the pitfalls of overfitting and build deep learning models that generalize better.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/563a9ac7db16/JPI-12-30-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/1ff0e3231866/JPI-12-30-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/c84d611c2501/JPI-12-30-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/febd43ee08d3/JPI-12-30-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/8b010718c6a4/JPI-12-30-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/b9ccf8cb4aff/JPI-12-30-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/927e/8404558/916db7f82738/JPI-12-30-g007.jpg
