Institute of Clinical Sciences, Danderyd University Hospital, Karolinska Institute, Sweden.
Department of Computer and System Sciences, Stockholm University, Sweden.
Acta Orthop. 2021 Oct;92(5):513-525. doi: 10.1080/17453674.2021.1918389. Epub 2021 May 14.
Background and purpose - Artificial intelligence (AI), deep learning (DL), and machine learning (ML) have become common research fields in orthopedics and medicine in general. Engineers perform much of the work. While they gear the results towards healthcare professionals, the difference in competencies and goals creates challenges for collaboration and knowledge exchange. We aim to provide clinicians with a context and understanding of AI research by facilitating communication between creators, researchers, clinicians, and readers of medical AI and ML research.
Methods and results - We present the common tasks, considerations, and pitfalls (both methodological and ethical) that clinicians will encounter in AI research. We discuss the following topics: labeling, missing data, training, testing, and overfitting. Common performance and outcome measures for various AI and ML tasks are presented, including accuracy, precision, recall, F1 score, Dice score, the area under the curve, and ROC curves. We also discuss ethical considerations in terms of privacy, fairness, autonomy, safety, responsibility, and liability regarding data collection or sharing.
Interpretation - We have developed guidelines for reporting medical AI research to clinicians in the run-up to a broader consensus process. The proposed guidelines consist of a Clinical Artificial Intelligence Research (CAIR) checklist and specific performance metrics guidelines to present and evaluate research using AI components. Researchers, engineers, clinicians, and other stakeholders can use these proposed guidelines and the CAIR checklist to read, present, and evaluate AI research geared towards a healthcare setting.
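For readers less familiar with the performance measures named above, the following is a minimal sketch of how they are commonly computed for a binary task. It is an illustration under stated assumptions, not the authors' code or part of the CAIR checklist: the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) and the example data are hypothetical, labels are assumed to be coded 0/1, and the area under the ROC curve is computed with the standard pairwise-ranking formulation.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, F1, and Dice from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    # Dice = 2|A∩B| / (|A| + |B|); on binary labels this equals the F1 score.
    dice = (2 * tp / (2 * tp + fp + fn)) if (2 * tp + fp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "dice": dice,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical example: 3 cases with the finding, 5 without.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.1]
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Accuracy alone can look favorable when one class dominates, which is one reason precision, recall, and F1 are usually reported alongside it. The Dice score is mostly used for segmentation, where the same formula is applied to pixels or voxels.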