Semantic Harmonization of Alzheimer's Disease Datasets Using AD-Mapper.

Affiliations

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany.

Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany.

Publication information

J Alzheimers Dis. 2024;99(4):1409-1423. doi: 10.3233/JAD-240116.

Abstract

BACKGROUND

Despite numerous efforts to semantically harmonize Alzheimer's disease (AD) cohort studies, no automatic tool for the task has yet been developed.

OBJECTIVE

As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool.

METHODS

We created a common data model (CDM) by cross-mapping data from 20 cohorts, three CDMs, and ontology terms, and then used this CDM to fine-tune a BioBERT model. Finally, we evaluated the model on three previously unseen cohorts and compared its performance to that of a string-matching baseline model.
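The abstract does not say how the string-matching baseline was implemented. As a rough illustration only, the sketch below ranks CDM variables by character-level similarity to a cohort variable name using Python's standard `difflib`, and scores the result with top-k accuracy. The variable names, the `normalize` step, and the choice of `SequenceMatcher` are all assumptions made for this example, not details taken from the paper.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical CDM variable names for illustration; not taken from the paper.
CDM_VARIABLES = ["Age", "Sex", "Education years", "MMSE total score", "APOE genotype"]


def normalize(name: str) -> str:
    """Lowercase and unify separators so 'mmse_total' compares as 'mmse total'."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def string_match(variable: str, targets: list[str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank CDM targets by character-level similarity to a cohort variable name."""
    query = normalize(variable)
    scored = [
        (target, SequenceMatcher(None, query, normalize(target)).ratio())
        for target in targets
    ]
    # Highest similarity first; keep only the top_k candidates.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


def top_k_accuracy(gold: dict[str, str], targets: list[str], top_k: int = 3) -> float:
    """Fraction of cohort variables whose gold CDM target is among the top_k candidates."""
    hits = sum(
        gold_target in [t for t, _ in string_match(var, targets, top_k)]
        for var, gold_target in gold.items()
    )
    return hits / len(gold)
```

A baseline of this kind can only reward surface overlap between names. A fine-tuned language model such as BioBERT could in principle also match variables that are named differently but mean the same thing.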

RESULTS

Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables.

CONCLUSION

AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7b9/11191441/0a2349a34796/jad-99-jad240116-g001.jpg
