Differences in cohort study data affect external validation of artificial intelligence models for predictive diagnostics of dementia - lessons for translation into clinical practice.

Authors

Birkenbihl Colin, Emon Mohammad Asif, Vrooman Henri, Westwood Sarah, Lovestone Simon, Hofmann-Apitius Martin, Fröhlich Holger

Affiliations

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany.

Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany.

Publication information

EPMA J. 2020 Jun 22;11(3):367-376. doi: 10.1007/s13167-020-00216-z. eCollection 2020 Sep.

Abstract

Artificial intelligence (AI) approaches offer a great opportunity for individualized, pre-symptomatic disease diagnosis, which plays a key role in personalized, predictive, and ultimately preventive medicine (PPPM). However, to translate PPPM into clinical practice, AI-based models must be carefully validated. The validation process comprises several steps, one of which is testing the model on patient-level data from an independent clinical cohort study. Recruitment criteria, however, can bias statistical analyses of cohort study data and impede model application beyond the training data. To evaluate whether and how data from independent clinical cohort studies differ, this study systematically compares the datasets collected by two major dementia cohorts, the Alzheimer's Disease Neuroimaging Initiative (ADNI) and AddNeuroMed. The comparison was conducted at the level of individual features and revealed significant differences between the two cohorts. Such systematic deviations can hamper the generalizability of results based on a single cohort dataset. Despite the identified differences, validation of a previously published, ADNI-trained model for predicting personalized dementia risk scores on 244 AddNeuroMed subjects was successful: external validation yielded a high prediction performance, with an area under the receiver operating characteristic curve above 80% up to 6 years before dementia diagnosis. Propensity score matching identified a subset of AddNeuroMed patients with significantly smaller demographic differences to ADNI. For these patients, an even higher prediction performance was achieved, which demonstrates the influence that systematic differences between cohorts can have on validation results.
In conclusion, this study exposes challenges in the external validation of AI models on cohort study data and is one of the rare cases in neurology in which such external validation was performed. The presented model represents a proof of concept that reliable models for personalized predictive diagnostics are feasible, which in turn could lead to adequate disease prevention and thereby enable the PPPM paradigm in the dementia field.
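
The two evaluation ideas named in the abstract, area under the receiver operating characteristic curve and propensity score matching, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `match_on_propensity`, the caliper value, and the toy numbers are all hypothetical. The propensity scores are assumed to be already estimated, for example as the probability of cohort membership given demographic variables, and the greedy 1:1 matching shown here is only one of several possible matching strategies.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def match_on_propensity(
    external: Sequence[float],
    reference: Sequence[float],
    caliper: float = 0.1,
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Both inputs are precomputed propensity scores (an assumption of this
    sketch). Returns (external_index, reference_index) pairs whose scores
    differ by at most `caliper`; unmatched subjects are dropped.
    """
    available = set(range(len(reference)))
    pairs: List[Tuple[int, int]] = []
    for i, score in enumerate(external):
        if not available:
            break
        # Closest remaining reference subject; index breaks ties.
        j = min(available, key=lambda k: (abs(reference[k] - score), k))
        if abs(reference[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


if __name__ == "__main__":
    # Hypothetical toy data: 1 = later dementia diagnosis, 0 = none.
    labels = [0, 0, 1, 1, 0, 1]
    risk_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
    print("AUC on all external subjects:", roc_auc(labels, risk_scores))

    # Hypothetical propensity scores for the external and training cohorts.
    external_ps = [0.30, 0.72, 0.95]
    reference_ps = [0.28, 0.70, 0.33, 0.50]
    print("Matched pairs:", match_on_propensity(external_ps, reference_ps, 0.05))
```

In a validation setting, the external indices returned by `match_on_propensity` would select the subset of the external cohort that resembles the training cohort, and `roc_auc` could then be computed on that subset and compared with the value for the full cohort.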

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90f1/7429672/2d2199dcc68d/13167_2020_216_Fig1_HTML.jpg
