Efficiency and safety of automated label cleaning on multimodal retinal images.

Authors

Lin Tian, Wang Meng, Lin Aidi, Mai Xiaoting, Liang Huiyu, Tham Yih-Chung, Chen Haoyu

Affiliations

Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou, Guangdong, 515041, China.

Shantou University Medical College, Shantou, Guangdong, 515041, China.

Publication information

NPJ Digit Med. 2025 Jan 5;8(1):10. doi: 10.1038/s41746-024-01424-x.

DOI:10.1038/s41746-024-01424-x
PMID:39757295
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11701072/
Abstract

Label noise is a common and important problem that can degrade model performance in artificial intelligence. This study assessed the effectiveness and potential risks of automated label cleaning with Cleanlab, an open-source framework, on multi-category datasets of fundus photography and optical coherence tomography into which label noise of 0 to 70% had been intentionally introduced. After six cycles of automatic cleaning, label accuracy improved significantly by 3.4-62.9% and the dataset quality score (DQS) by 5.1-74.4%. Most label errors (86.6-97.5%) were accurately corrected; few were missed (0.5-2.8%) or misclassified (0.4-10.6%). The classification accuracy of RETFound improved significantly, by 0.3-52.9%, when it was trained on the cleaned datasets. We also developed a DQS-guided cleaning strategy to mitigate over-cleaning. Furthermore, in external validation on the EyePACS and APTOS-2019 datasets, cleaning raised label accuracy by 1.3% and 1.8%, respectively. This approach automates label correction, enhances dataset reliability, and strengthens model performance efficiently and safely.
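
The evaluation described in the abstract depends on knowing the true labels: noise is injected at a known rate, the labels are cleaned, and each cleaned label is compared with the ground truth. The following is a minimal Python sketch of that bookkeeping only. It does not call Cleanlab, and the function names, the symmetric noise model, and the exact definitions of "corrected", "missed", "misclassified", and "over-cleaned" are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import Counter


def inject_label_noise(true_labels, noise_rate, num_classes, seed=0):
    """Flip a fixed fraction of labels to a different, randomly chosen class.

    Assumption: symmetric noise; the paper's noise model may differ.
    """
    rng = random.Random(seed)
    noisy = list(true_labels)
    n_flip = round(noise_rate * len(noisy))
    # Sample distinct positions so exactly n_flip labels become wrong.
    for i in rng.sample(range(len(noisy)), n_flip):
        wrong_classes = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong_classes)
    return noisy


def label_accuracy(labels, true_labels):
    """Fraction of labels that agree with the ground truth."""
    matches = sum(a == b for a, b in zip(labels, true_labels))
    return matches / len(true_labels)


def audit_cleaning(true_labels, noisy_labels, cleaned_labels):
    """Classify the outcome of cleaning for every sample (hypothetical categories)."""
    counts = Counter()
    for t, n, c in zip(true_labels, noisy_labels, cleaned_labels):
        if n == t:
            # Label was right before cleaning; changing it is over-cleaning.
            counts["kept_correct" if c == t else "over_cleaned"] += 1
        elif c == t:
            counts["corrected"] += 1      # error fixed to the true class
        elif c == n:
            counts["missed"] += 1         # error left untouched
        else:
            counts["misclassified"] += 1  # error changed to another wrong class
    return counts


# Toy data: 200 samples over 4 classes with 30% injected noise.
true_labels = [i % 4 for i in range(200)]
noisy_labels = inject_label_noise(true_labels, noise_rate=0.3, num_classes=4)
print("label accuracy after noise injection:", label_accuracy(noisy_labels, true_labels))
```

In a real pipeline, the cleaned labels passed to `audit_cleaning` would come from the cleaning tool under test. The "over_cleaned" count corresponds to the risk that the DQS-guided strategy in the abstract is meant to limit.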

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c4a6/11701072/d50cbb3c4b01/41746_2024_1424_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c4a6/11701072/a94e482ceb9e/41746_2024_1424_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c4a6/11701072/f50cb323c82a/41746_2024_1424_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c4a6/11701072/1b5f6ce4ff2b/41746_2024_1424_Fig4_HTML.jpg

Similar articles

1
Efficiency and safety of automated label cleaning on multimodal retinal images.
NPJ Digit Med. 2025 Jan 5;8(1):10. doi: 10.1038/s41746-024-01424-x.
2
Independent Evaluation of RETFound Foundation Model's Performance on Optic Nerve Analysis Using Fundus Photography.
Ophthalmol Sci. 2025 Jan 28;5(3):100720. doi: 10.1016/j.xops.2025.100720. eCollection 2025 May-Jun.
3
Phenotyping Alfalfa (Medicago sativa L.) Root Structure Architecture via Integrating Confident Machine Learning with ResNet-18.
Plant Phenomics. 2024 Sep 11;6:0251. doi: 10.34133/plantphenomics.0251. eCollection 2024.
4
Multi-Fundus Diseases Classification Using Retinal Optical Coherence Tomography Images with Swin Transformer V2.
J Imaging. 2023 Sep 29;9(10):203. doi: 10.3390/jimaging9100203.
5
The potential of artificial intelligence reading label system on the training of ophthalmologists in retinal diseases, a multicenter bimodal multi-disease study.
BMC Med Educ. 2025 Apr 8;25(1):503. doi: 10.1186/s12909-025-07066-1.
6
Evaluating a Foundation Artificial Intelligence Model for Glaucoma Detection Using Color Fundus Photographs.
Ophthalmol Sci. 2024 Sep 14;5(1):100623. doi: 10.1016/j.xops.2024.100623. eCollection 2025 Jan-Feb.
7
[Research on multi-class orthodontic image recognition system based on deep learning network model].
Zhonghua Kou Qiang Yi Xue Za Zhi. 2023 Jun 9;58(6):561-568. doi: 10.3760/cma.j.cn112144-20230305-00070.
8
Advancing Glaucoma Diagnosis: Employing Confidence-Calibrated Label Smoothing Loss for Model Calibration.
Ophthalmol Sci. 2024 Jun 22;4(6):100555. doi: 10.1016/j.xops.2024.100555. eCollection 2024 Nov-Dec.
9
A Lightweight Diabetic Retinopathy Detection Model Using a Deep-Learning Technique.
Diagnostics (Basel). 2023 Oct 3;13(19):3120. doi: 10.3390/diagnostics13193120.
10
Automated deep learning design for medical image classification by health-care professionals with no coding experience: a feasibility study.
Lancet Digit Health. 2019 Sep;1(5):e232-e242. doi: 10.1016/S2589-7500(19)30108-6. Epub 2019 Sep 5.
