Clinical Abbreviation Disambiguation Using Deep Contextualized Representation.

Author information

Peng Mingkai, Quan Hude

Affiliation

Department of Community Health Sciences, University of Calgary, Calgary, Canada.

Publication information

Stud Health Technol Inform. 2020 Jun 16;270:88-92. doi: 10.3233/SHTI200128.

Abstract

The objective of this study was to develop a method for clinical abbreviation disambiguation using deep contextualized representations and cluster analysis. We used the pre-trained BioELMo language model to generate a contextualized word vector for the abbreviation in each instance. Principal component analysis was then applied to the word vectors to reduce their dimensionality. K-Means cluster analysis was run separately for each abbreviation, and each cluster was assigned a sense by majority vote of its annotations. The method achieved an average accuracy of about 95% across 74 abbreviations. Simulation showed that determining the sense of each cluster required annotating 5 samples.
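
The pipeline described in the abstract (dimension reduction, per-abbreviation clustering, majority-vote sense assignment) can be illustrated with a minimal sketch. This is not the authors' implementation. The BioELMo embedding step is omitted and replaced by synthetic vectors, the abbreviation "RA" and its two senses are an illustrative choice, and the number of components, the number of clusters, and the deterministic farthest-point initialisation are all assumptions made to keep the example small and reproducible.

```python
from collections import Counter

import numpy as np


def pca_reduce(vectors, n_components):
    """Project row vectors onto their top principal components via SVD."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with farthest-point initialisation (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Squared distance of each point to its nearest already-chosen center.
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def assign_cluster_senses(labels, annotations, n_annotated=5):
    """Give each cluster the majority sense among a few annotated members."""
    senses = {}
    for cluster in np.unique(labels):
        # Only the first n_annotated members are treated as annotated.
        members = np.flatnonzero(labels == cluster)[:n_annotated]
        votes = Counter(annotations[i] for i in members)
        senses[int(cluster)] = votes.most_common(1)[0][0]
    return senses


# Synthetic stand-in for contextual embeddings of one abbreviation ("RA"):
# two well-separated groups of 16-dimensional vectors, one per sense.
rng = np.random.default_rng(0)
vectors = np.vstack([
    rng.normal(0.0, 0.5, size=(30, 16)),
    rng.normal(5.0, 0.5, size=(30, 16)),
])
true_senses = ["rheumatoid arthritis"] * 30 + ["right atrium"] * 30

reduced = pca_reduce(vectors, n_components=2)
labels = kmeans(reduced, k=2)
cluster_sense = assign_cluster_senses(labels, true_senses)
predicted = [cluster_sense[int(c)] for c in labels]
accuracy = sum(p == t for p, t in zip(predicted, true_senses)) / len(true_senses)
print(cluster_sense, accuracy)
```

Here the true senses double as the annotations, and only five members per cluster are consulted, mirroring the annotation budget reported in the abstract. With real embeddings the clusters would overlap more, so the annotated members would normally be sampled at random rather than taken in order.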
