Automated detection of poor-quality data: case studies in healthcare.

Affiliations

Presagen, Adelaide, SA, 5000, Australia.

School of Mathematical Sciences, The University of Adelaide, Adelaide, SA, 5000, Australia.

Publication information

Sci Rep. 2021 Sep 9;11(1):18005. doi: 10.1038/s41598-021-97341-0.

DOI: 10.1038/s41598-021-97341-0

PMID: 34504205

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8429593/

Abstract

The detection and removal of poor-quality data in a training set is crucial to achieve high-performing AI models. In healthcare, data can be inherently poor-quality due to uncertainty or subjectivity, but as is often the case, the requirement for data privacy restricts AI practitioners from accessing raw training data, meaning manual visual verification of private patient data is not possible. Here we describe a novel method for automated identification of poor-quality data, called Untrainable Data Cleansing. This method is shown to have numerous benefits including protection of private patient data; improvement in AI generalizability; reduction in time, cost, and data needed for training; all while offering a truer reporting of AI performance itself. Additionally, results show that Untrainable Data Cleansing could be useful as a triage tool to identify difficult clinical cases that may warrant in-depth evaluation or additional testing to support a diagnosis.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/830b/8429593/8367f36fe47c/41598_2021_97341_Fig1_HTML.jpg
