Mitigating Bias in Radiology Machine Learning: 1. Data Handling.

Author information

Rouzrokh Pouria, Khosravi Bardia, Faghani Shahriar, Moassefi Mana, Vera Garcia Diana V, Singh Yashbir, Zhang Kuan, Conte Gian Marco, Erickson Bradley J

Affiliation

Radiology Informatics Laboratory, Department of Radiology, Mayo Clinic, 200 1st St SW, Rochester, MN 55905.

Publication information

Radiol Artif Intell. 2022 Aug 24;4(5):e210290. doi: 10.1148/ryai.210290. eCollection 2022 Sep.

Abstract

Minimizing bias is critical to adoption and implementation of machine learning (ML) in clinical practice. Systematic mathematical biases produce consistent and reproducible differences between the observed and expected performance of ML systems, resulting in suboptimal performance. Such biases can be traced back to various phases of ML development: data handling, model development, and performance evaluation. This report presents 12 suboptimal practices during data handling of an ML study, explains how those practices can lead to biases, and describes what may be done to mitigate them. Authors employ an arbitrary and simplified framework that splits ML data handling into four steps: data collection, data investigation, data splitting, and feature engineering. Examples from the available research literature are provided. A Google Colaboratory Jupyter notebook includes code examples to demonstrate the suboptimal practices and steps to prevent them.

Keywords: Data Handling, Bias, Machine Learning, Deep Learning, Convolutional Neural Network (CNN), Computer-aided Diagnosis (CAD)

© RSNA, 2022.
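As a rough illustration of the data splitting step named in the abstract, the sketch below shows one commonly cited safeguard: splitting at the patient level so that images from the same patient never land in both the training and test sets. This is not taken from the authors' notebook. The function name `split_by_patient`, the `patient_id` and `image` fields, and the toy records are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records into train/test so that no patient appears in both.

    Hypothetical helper: `records` is a list of dicts with a "patient_id" key.
    """
    # Group every record under its patient ID.
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_id"]].append(rec)

    # Shuffle patient IDs (not individual images) with a fixed seed.
    patient_ids = sorted(by_patient)
    rng = random.Random(seed)
    rng.shuffle(patient_ids)

    # Reserve a fraction of patients, at least one, for the test set.
    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_patients = set(patient_ids[:n_test])

    train = [r for pid in patient_ids if pid not in test_patients
             for r in by_patient[pid]]
    test = [r for pid in patient_ids if pid in test_patients
            for r in by_patient[pid]]
    return train, test


# Hypothetical toy data: 10 patients with 3 images each.
records = [
    {"patient_id": f"P{p:02d}", "image": f"img_{p}_{i}.png"}
    for p in range(10)
    for i in range(3)
]

train, test = split_by_patient(records)
train_ids = {r["patient_id"] for r in train}
test_ids = {r["patient_id"] for r in test}

# No patient should contribute images to both sets.
print(train_ids.isdisjoint(test_ids))
```

Shuffling patient IDs rather than individual images is the point of the sketch: a per-image random split could place near-identical images of one patient on both sides and inflate the measured test performance.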

Similar articles

1. Mitigating Bias in Radiology Machine Learning: 1. Data Handling.
Radiol Artif Intell. 2022 Aug 24;4(5):e210290. doi: 10.1148/ryai.210290. eCollection 2022 Sep.
2. Mitigating Bias in Radiology Machine Learning: 3. Performance Metrics.
Radiol Artif Intell. 2022 Aug 24;4(5):e220061. doi: 10.1148/ryai.220061. eCollection 2022 Sep.
3. Mitigating Bias in Radiology Machine Learning: 2. Model Development.
Radiol Artif Intell. 2022 Aug 24;4(5):e220010. doi: 10.1148/ryai.220010. eCollection 2022 Sep.
