Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA; Computational Neuroscience Outcomes Center (CNOC), Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA 02115, USA.
Computational Neuroscience Outcomes Center (CNOC), Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, 75 Francis St, Boston, MA 02115, USA; Department of Neurosurgery, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, Netherlands.
Artif Intell Med. 2023 Sep;143:102607. doi: 10.1016/j.artmed.2023.102607. Epub 2023 Jun 7.
Over the past decade, machine learning (ML) and artificial intelligence (AI) have become increasingly prevalent in the medical field. In the United States, the Food and Drug Administration (FDA) is responsible for regulating AI algorithms as "medical devices" to ensure patient safety. However, recent work has shown that the FDA approval process may be deficient. In this study, we evaluate the evidence supporting FDA-approved neuroalgorithms, the subset of machine learning algorithms with applications in the central nervous system (CNS), through a systematic review of the primary literature. We searched PubMed, EMBASE, Google Scholar, and Scopus, from database inception to January 25, 2022, for articles covering the 53 FDA-approved algorithms with applications in the CNS. Initial searches identified 1505 studies, of which 92 articles met the criteria for extraction and inclusion. Studies were identified for 26 of the 53 neuroalgorithms, and 10 of these algorithms had only a single peer-reviewed publication. Performance metrics were available for 15 algorithms, external validation studies were available for 24 algorithms, and studies exploring the use of algorithms in clinical practice were available for 7 algorithms. Papers studying the clinical utility of these algorithms focused on three domains: workflow efficiency, cost savings, and clinical outcomes. Our analysis suggests that there is a meaningful gap between the FDA approval of machine learning algorithms and their clinical utilization. There appears to be room for process improvement through the following recommendations: providing compelling evidence that algorithms perform as intended, mandating minimum sample sizes, reporting a predefined set of performance metrics for all algorithms, and applying algorithms in clinical settings prior to widespread use. This work will serve as a baseline for future research into the ideal regulatory framework for AI applications worldwide.