BACKGROUND

Artificial intelligence (AI) has the potential to revolutionize health care by enhancing both clinical outcomes and operational efficiency. However, its clinical adoption has been slower than anticipated, largely due to the absence of comprehensive evaluation frameworks. Existing frameworks remain insufficient and tend to emphasize technical metrics such as accuracy and validation, while overlooking critical real-world factors such as clinical impact, integration, and economic sustainability. This narrow focus prevents AI tools from being effectively implemented, limiting their broader impact and long-term viability in clinical practice.

OBJECTIVE

This study aimed to create a framework for assessing AI in health care, extending beyond technical metrics to incorporate social and organizational dimensions. The framework was developed by systematically reviewing, analyzing, and synthesizing the evaluation criteria necessary for successful implementation, focusing on the long-term real-world impact of AI in clinical practice.

METHODS

A search was performed in July 2024 across the PubMed, Cochrane, Scopus, and IEEE Xplore databases to identify relevant studies published in English between January 2019 and mid-July 2024. The search yielded 3528 results, of which 44 studies met the inclusion criteria. The systematic review followed PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) guidelines and the Cochrane Handbook for Systematic Reviews. Data were analyzed in NVivo using thematic analysis and narrative synthesis to identify key emergent themes in the studies.

RESULTS

By synthesizing the included studies, we developed a framework that goes beyond the traditional focus on technical metrics or study-level methodologies. It integrates clinical context and real-world implementation factors, offering a more comprehensive approach to evaluating AI tools. With our focus on assessing the long-term real-world impact of AI technologies in health care, we named the framework AI for IMPACTS. The criteria are organized into seven key clusters, each corresponding to a letter in the acronym: (1) I-integration, interoperability, and workflow; (2) M-monitoring, governance, and accountability; (3) P-performance and quality metrics; (4) A-acceptability, trust, and training; (5) C-cost and economic evaluation; (6) T-technological safety and transparency; and (7) S-scalability and impact. These are further broken down into 28 specific subcriteria.
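As an illustration only, the seven clusters listed above could be held in a small lookup structure so that an assessment can be checked for completeness. The sketch below is not part of the published framework: the `Cluster` class, the `unassessed` helper, and the rating labels are hypothetical names chosen for this example, and the 28 subcriteria are omitted because the abstract does not list them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """One cluster of the framework: its acronym letter and its name."""
    letter: str
    name: str


# Cluster letters and names are taken from the abstract's RESULTS section.
IMPACTS_CLUSTERS = (
    Cluster("I", "Integration, interoperability, and workflow"),
    Cluster("M", "Monitoring, governance, and accountability"),
    Cluster("P", "Performance and quality metrics"),
    Cluster("A", "Acceptability, trust, and training"),
    Cluster("C", "Cost and economic evaluation"),
    Cluster("T", "Technological safety and transparency"),
    Cluster("S", "Scalability and impact"),
)


def acronym(clusters):
    """Join the cluster letters in order."""
    return "".join(c.letter for c in clusters)


def unassessed(ratings):
    """Return the letters of clusters that have no rating recorded.

    `ratings` is a hypothetical dict mapping a cluster letter to any
    rating value; the rating scale itself is an assumption of this sketch.
    """
    return [c.letter for c in IMPACTS_CLUSTERS if c.letter not in ratings]


if __name__ == "__main__":
    print(acronym(IMPACTS_CLUSTERS))
    print(unassessed({"I": "met", "P": "partially met"}))
```

A structure like this only records which clusters have been covered; how each cluster is rated would depend on the subcriteria defined in the full paper.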

CONCLUSIONS

The AI for IMPACTS framework offers a holistic approach to evaluating the long-term real-world impact of AI tools in the heterogeneous and challenging health care context. It lays the groundwork for further validation through expert consensus and testing in real-world health care settings. Multidisciplinary expertise is essential for assessment, yet many assessors lack the necessary training. In addition, traditional evaluation methods struggle to keep pace with AI's rapid development. Successful AI integration therefore requires flexible, fast-tracked assessment processes and proper assessor training, so that rigorous standards are maintained while adapting to AI's dynamic evolution.

TRIAL REGISTRATION

reviewregistry1859; https://tinyurl.com/ysn2d7sh.

AI for IMPACTS Framework for Evaluating the Long-Term Real-World Impacts of AI-Powered Clinician Tools: Systematic Review and Narrative Synthesis.
