Quality of word and concept embeddings in targetted biomedical domains.

Author information

Giancani Salvatore, Albertoni Riccardo, Catalano Chiara Eva

Affiliations

Institut de Neurosciences de la Timone, Unité Mixte de Recherche 7289 Centre National de la Recherche Scientifique and Aix-Marseille Université, Faculty of Medicine, 27, Boulevard Jean Moulin, 13385 Marseille Cedex 05, France.

Istituto di Matematica Applicata e Tecnologie Informatiche, Consiglio Nazionale delle Ricerche, Via De Marini 16, 16149 Genova, Italy.

Publication information

Heliyon. 2023 Jun 2;9(6):e16818. doi: 10.1016/j.heliyon.2023.e16818. eCollection 2023 Jun.

Abstract

Embeddings are fundamental resources that are often reused to build intelligent systems in the biomedical context. Evaluating the quality of previously trained embeddings, and ensuring that they cover the desired information, is therefore critical to the success of applications. This paper proposes a new evaluation methodology to test the coverage of embeddings against a targeted domain of interest. It defines measures to assess terminology, similarity, and analogy coverage, which are core aspects of the embeddings. It then discusses the experimentation carried out on existing biomedical embeddings in the specific context of pulmonary diseases. The proposed methodology and measures are general and may be applied to any application domain.
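The abstract does not give the formulas behind its coverage measures. As a rough illustration only, terminology coverage could be read as the share of domain terms that appear in an embedding's vocabulary. The sketch below assumes that reading; the function name `terminology_coverage`, the case-insensitive matching, and the sample terms are all hypothetical and are not taken from the paper.

```python
def terminology_coverage(domain_terms, vocabulary):
    """Return (fraction of domain terms found in the vocabulary, missing terms).

    Hypothetical illustration: the paper's actual measure may differ.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    vocab = {term.lower() for term in vocabulary}
    terms = {term.lower() for term in domain_terms}
    if not terms:
        raise ValueError("domain_terms must not be empty")
    missing = sorted(terms - vocab)
    covered = len(terms) - len(missing)
    return covered / len(terms), missing


# Made-up sample data: a tiny pulmonary term list and a toy embedding vocabulary.
domain_terms = ["asthma", "pneumonia", "bronchiectasis", "COPD"]
vocabulary = ["asthma", "pneumonia", "copd", "fever"]

ratio, missing = terminology_coverage(domain_terms, vocabulary)
print(f"coverage = {ratio:.2f}, missing = {missing}")
```

With a real embedding, `vocabulary` would be the set of keys the model exposes, and `domain_terms` would come from a curated terminology for the target domain.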

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05ae/10272317/f71d665ee1b9/gr001.jpg
