Generalizability of electroencephalographic interpretation using artificial intelligence: An external validation study.

Affiliations

Analytical Neurophysiology Lab, Montreal Neurological Institute and Hospital, Montreal, Quebec, Canada.

Neurophysiology Unit, Institute of Neurosurgery Dr. Asenjo, Santiago, Chile.

Publication information

Epilepsia. 2024 Oct;65(10):3028-3037. doi: 10.1111/epi.18082. Epub 2024 Aug 14.

DOI:10.1111/epi.18082

PMID:39141002

Abstract

OBJECTIVE

The automated interpretation of clinical electroencephalograms (EEGs) using artificial intelligence (AI) holds the potential to bridge the treatment gap in resource-limited settings and reduce the workload at specialized centers. However, to facilitate broad clinical implementation, it is essential to establish generalizability across diverse patient populations and equipment. We assessed whether SCORE-AI demonstrates diagnostic accuracy comparable to that of experts when applied to a geographically different patient population, recorded with distinct EEG equipment and technical settings.

METHODS

We assessed the diagnostic accuracy of a "fixed-and-frozen" AI model, using an independent dataset and external gold standard, and benchmarked it against three experts blinded to all other data. The dataset comprised 50% normal and 50% abnormal routine EEGs, equally distributed among the four major classes of EEG abnormalities (focal epileptiform, generalized epileptiform, focal nonepileptiform, and diffuse nonepileptiform). To assess diagnostic accuracy, we computed sensitivity, specificity, and accuracy of the AI model and the experts against the external gold standard.
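The three metrics named above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the `ConfusionCounts` class, the `tally` helper, and the six example labels are hypothetical, and each EEG is reduced to a single abnormal/normal call compared against the gold standard.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a rater's calls against the gold standard (hypothetical helper)."""
    tp: int  # abnormal by gold standard, called abnormal
    fp: int  # normal by gold standard, called abnormal
    tn: int  # normal by gold standard, called normal
    fn: int  # abnormal by gold standard, called normal

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)


def tally(predicted: Iterable[bool], reference: Iterable[bool]) -> ConfusionCounts:
    """Compare per-recording calls (True = abnormal) with the gold standard."""
    tp = fp = tn = fn = 0
    for pred, ref in zip(predicted, reference):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif not pred and ref:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


# Made-up labels for six recordings; not data from the study.
reference = [True, True, True, False, False, False]
predicted = [True, True, False, False, False, True]
counts = tally(predicted, reference)
print(counts.sensitivity, counts.specificity, counts.accuracy)
```

The same tally can be run once for the AI model and once per expert, so that every rater is scored against the identical gold standard.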

RESULTS

We analyzed EEGs from 104 patients (64 females, median age = 38.6 [range = 16-91] years). SCORE-AI performed equally well compared to the experts, with an overall accuracy of 92% (95% confidence interval [CI] = 90%-94%) versus 94% (95% CI = 92%-96%). There was no significant difference between SCORE-AI and the experts for any metric or category. SCORE-AI performed well independently of the vigilance state (false classification during awake: 5/41 [12.2%], false classification during sleep: 2/11 [18.2%]; p = .63) and normal variants (false classification in presence of normal variants: 4/14 [28.6%], false classification in absence of normal variants: 3/38 [7.9%]; p = .07).
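The two subgroup comparisons above can be rechecked from the reported counts. The abstract does not name the statistical test; the sketch below assumes a two-sided Fisher's exact test on each 2x2 table, and the function name `fisher_exact_two_sided` is illustrative. Under that assumption the reported p-values are reproduced.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Fraction:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c

    def weight(k: int) -> int:
        # Unnormalized probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    # By Vandermonde's identity the weights sum to comb(row1 + row2, col1).
    return Fraction(extreme, comb(row1 + row2, col1))


# Rows: subgroup; columns: (false classifications, correct classifications).
p_vigilance = fisher_exact_two_sided(5, 41 - 5, 2, 11 - 2)   # awake vs. sleep
p_variants = fisher_exact_two_sided(4, 14 - 4, 3, 38 - 3)    # variants present vs. absent

print(f"{float(p_vigilance):.2f}")  # 0.63
print(f"{float(p_variants):.2f}")   # 0.07
```

Integer arithmetic with `Fraction` avoids floating-point ties when deciding which tables count as at least as extreme as the observed one.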

SIGNIFICANCE

SCORE-AI achieved diagnostic performance equal to human experts in an EEG dataset independent of the development dataset, in a geographically distinct patient population, recorded with different equipment and technical settings than the development dataset.
