A practical guide to the implementation of AI in orthopaedic research, Part 6: How to evaluate the performance of AI research?

Authors

Oettl Felix C, Pareek Ayoosh, Winkler Philipp W, Zsidai Bálint, Pruneski James A, Senorski Eric Hamrin, Kopf Sebastian, Ley Christophe, Herbst Elmar, Oeding Jacob F, Grassi Alberto, Hirschmann Michael T, Musahl Volker, Samuelsson Kristian, Tischer Thomas, Feldt Robert

Affiliations

Hospital for Special Surgery New York New York USA.

Schulthess Klinik Zurich Switzerland.

Publication information

J Exp Orthop. 2024 May 31;11(3):e12039. doi: 10.1002/jeo2.12039. eCollection 2024 Jul.


DOI:10.1002/jeo2.12039
PMID:38826500
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11141501/
Abstract

Artificial intelligence's (AI) accelerating progress demands rigorous evaluation standards to ensure safe, effective integration into healthcare's high-stakes decisions. As AI increasingly enables prediction, analysis and judgement capabilities relevant to medicine, proper evaluation and interpretation are indispensable. Erroneous AI could endanger patients; thus, developing, validating and deploying medical AI demands adhering to strict, transparent standards centred on safety, ethics and responsible oversight. Core considerations include assessing performance on diverse real-world data, collaborating with domain experts, confirming model reliability and limitations, and advancing interpretability. Thoughtful selection of evaluation metrics suited to the clinical context along with testing on diverse data sets representing different populations improves generalisability. Partnering software engineers, data scientists and medical practitioners ground assessment in real needs. Journals must uphold reporting standards matching AI's societal impacts. With rigorous, holistic evaluation frameworks, AI can progress towards expanding healthcare access and quality.

Level of Evidence: Level V.
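
The abstract's point about choosing metrics that fit the clinical context and testing across different populations can be illustrated with a small sketch. The abstract itself does not name specific metrics; sensitivity, specificity and the Matthews correlation coefficient (MCC) are assumed here as common choices for binary classifiers. The function names, the labels and the `site_A`/`site_B` groups are hypothetical and serve only as illustration, not as the authors' method.

```python
from collections import defaultdict
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Sensitivity, specificity and MCC from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


def metrics_by_group(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup label."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: metrics(ts, ps) for g, (ts, ps) in buckets.items()}


# Hypothetical toy data: two sites standing in for two populations.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["site_A"] * 4 + ["site_B"] * 4

overall = metrics(y_true, y_pred)
per_group = metrics_by_group(y_true, y_pred, groups)
print(overall)
print(per_group)
```

On this toy data the pooled MCC is 0.5, yet it is 1.0 for `site_A` and 0.0 for `site_B`: a single aggregate figure can hide a subgroup on which the model performs no better than chance, which is why per-population reporting matters.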


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb90/11141501/7052bdc12287/JEO2-11-e12039-g001.jpg
