Groupe de Recherche et d'accueil en Droit et Economie de la Santé (GRADES) Department, University Paris-Saclay, Orsay, France; Innovation Center for Medical Devices, Foch Hospital, 40 Rue Worth, 92150 Suresnes, France.
Pharmacy Department, Georges Pompidou European Hospital, AP-HP, 20 Rue Leblanc, 75015 Paris, France.
Artif Intell Med. 2023 Jun;140:102547. doi: 10.1016/j.artmed.2023.102547. Epub 2023 Apr 23.
INTRODUCTION: Artificial Intelligence-based Medical Devices (AI-based MDs) are experiencing exponential growth in healthcare. This study aimed to investigate whether current studies assessing AI contain the information required for health technology assessment (HTA) by HTA bodies.
METHODS: We conducted a systematic literature review based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses methodology to extract articles published between 2016 and 2021 related to the assessment of AI-based MDs. Data extraction focused on study characteristics, technology, algorithms, comparators, and results. AI quality assessment and HTA scores were calculated to evaluate whether the items present in the included studies were concordant with the HTA requirements. We performed a linear regression for the HTA and AI scores with the explanatory variables of the impact factor, publication date, and medical specialty. We conducted a univariate analysis of the HTA score and a multivariate analysis of the AI score with an alpha risk of 5 %.
RESULTS: Of 5578 retrieved records, 56 were included. The mean AI quality assessment score was 67 %; 32 % of articles had an AI quality score ≥ 70 %, 50 % had a score between 50 % and 70 %, and 18 % had a score under 50 %. The highest quality scores were observed for the study design (82 %) and optimisation (69 %) categories, whereas the scores were lowest in the clinical practice category (23 %). The mean HTA score was 52 % for all seven domains. All of the studies (100 %) assessed clinical effectiveness, whereas only 9 % evaluated safety and 20 % evaluated economic issues. There was a statistically significant relationship between the impact factor and the HTA and AI scores (both p = 0.046).
DISCUSSION: Clinical studies on AI-based MDs have limitations and often lack adapted, robust, and complete evidence. High-quality datasets are also required because the output data can only be trusted if the inputs are reliable. The existing assessment frameworks are not specifically designed to assess AI-based MDs. From the perspective of regulatory authorities, we suggest that these frameworks should be adapted to assess the interpretability, explainability, cybersecurity, and safety of ongoing updates. From the perspective of HTA agencies, we highlight that transparency, professional and patient acceptance, ethical issues, and organizational changes are required for the implementation of these devices. Economic assessments of AI should rely on a robust methodology (business impact or health economic models) to provide decision-makers with more reliable evidence.
CONCLUSION: Currently, AI studies are insufficient to cover HTA prerequisites. HTA processes also need to be adapted because they do not consider the important specificities of AI-based MDs. Specific HTA workflows and accurate assessment tools should be designed to standardise evaluations, generate reliable evidence, and create confidence.
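The percentage scores and the three quality bands reported in the results (≥ 70 %, 50 % to 70 %, under 50 %) can be illustrated with a minimal Python sketch. It assumes each study is scored as the share of checklist items it reports; the abstract does not give the actual scoring rules or item weights, so the function names, the unweighted scoring, and the placement of exactly 50 % in the middle band are assumptions made for illustration only.

```python
from collections import Counter

# Band labels follow the thresholds quoted in the abstract.
BANDS = (">=70%", "50-70%", "<50%")


def percent_score(items_met):
    """Return the share of checklist items a study reports, in percent.

    `items_met` is a list of booleans, one per checklist item
    (hypothetical unweighted scoring; the study's own rules may differ).
    """
    if not items_met:
        raise ValueError("checklist must contain at least one item")
    return 100.0 * sum(items_met) / len(items_met)


def quality_band(score):
    """Assign a percentage score to one of the three reported bands."""
    if score >= 70:
        return ">=70%"
    if score >= 50:  # assumption: a score of exactly 50 falls in the middle band
        return "50-70%"
    return "<50%"


def band_shares(scores):
    """Return the percentage of studies falling in each band."""
    counts = Counter(quality_band(s) for s in scores)
    return {band: 100.0 * counts[band] / len(scores) for band in BANDS}
```

Calling `band_shares` on a list of per-study scores returns a distribution of the same shape as the 32 % / 50 % / 18 % split reported above; the sketch does not reproduce the study's data.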