Methods for building sense inventories of abbreviations in clinical notes.

Authors

Xu Hua, Stetson Peter D, Friedman Carol

Affiliation

Department of Biomedical Informatics, Columbia University, New York, NY, USA.

Publication information

J Am Med Inform Assoc. 2009 Jan-Feb;16(1):103-8. doi: 10.1197/jamia.M2927. Epub 2008 Oct 24.

Abstract

OBJECTIVE

To develop methods for building corpus-specific sense inventories of abbreviations occurring in clinical documents.

DESIGN

A corpus of internal medicine admission notes was collected, and the instances of each clinical abbreviation in the corpus were grouped into sense clusters. One instance from each cluster was manually annotated to generate the final list of senses. Two clustering-based methods, Expectation Maximization (EM) and Farthest First (FF), and one random sampling method for sense detection were evaluated using a set of 12 clinical abbreviations.
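The cluster-then-annotate idea can be illustrated with a minimal sketch of farthest-first traversal, which picks instances that are maximally far from those already chosen so that each pick is likely to show the annotator a different sense. This is not the authors' implementation: the bag-of-words context vectors, the cosine distance, the fixed starting instance, and the toy contexts for the abbreviation "pt" are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Cosine distance between two bag-of-words Counters (assumed feature choice)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def farthest_first_centers(vectors, k):
    """Return indices of up to k instances chosen by farthest-first traversal."""
    if not vectors or k <= 0:
        return []
    # Start from instance 0 to keep the sketch deterministic; a random start is also common.
    centers = [0]
    # Distance from every instance to its nearest chosen center so far.
    nearest = [cosine_distance(v, vectors[0]) for v in vectors]
    while len(centers) < min(k, len(vectors)):
        candidate = max(range(len(vectors)), key=lambda i: nearest[i])
        if nearest[candidate] <= 1e-12:
            break  # every remaining instance coincides with a chosen center
        centers.append(candidate)
        for i, v in enumerate(vectors):
            nearest[i] = min(nearest[i], cosine_distance(v, vectors[candidate]))
    return centers


# Hypothetical context windows around the abbreviation "pt".
contexts = [
    ["pt", "denies", "chest", "pain"],
    ["pt", "reports", "chest", "pain"],
    ["pt", "consult", "for", "gait", "training"],
    ["pt", "denies", "fever"],
]
vectors = [Counter(tokens) for tokens in contexts]
centers = farthest_first_centers(vectors, k=2)
print(centers)  # [0, 2]
```

In this toy input, the two selected instances are the ones a human would annotate, one per cluster, instead of reading all four.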

MEASUREMENTS

The clustering-based sense detection methods were evaluated using a set of clinical abbreviations that had been manually sense-annotated. "Sense Completeness" and "Annotation Cost" were used to measure the performance of the different methods. Clustering error rates were also reported for the different clustering algorithms.
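The abstract does not define the two measures, so the sketch below rests on assumed definitions: sense completeness is taken as the fraction of the gold-standard senses that appear among the annotated instances, and annotation cost as the number of instances a human had to annotate. The function name and the example labels are hypothetical.

```python
def evaluate_annotation(gold_labels, annotated_indices):
    """Return (sense completeness, annotation cost) under the assumed definitions.

    gold_labels: the gold-standard sense of every instance of one abbreviation.
    annotated_indices: indices of the instances shown to the human annotator.
    """
    gold_senses = set(gold_labels)
    if not gold_senses:
        raise ValueError("gold_labels must contain at least one instance")
    found_senses = {gold_labels[i] for i in annotated_indices}
    completeness = len(found_senses & gold_senses) / len(gold_senses)
    cost = len(set(annotated_indices))
    return completeness, cost


# Hypothetical gold senses for four instances of the abbreviation "pt".
gold = ["patient", "patient", "physical therapy", "patient"]

# Annotating one instance per cluster versus two poorly chosen instances.
per_cluster = evaluate_annotation(gold, [0, 2])
poor_sample = evaluate_annotation(gold, [0, 1])
```

Under these assumptions, both selections cost two annotations, but only the first one recovers every sense.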

RESULTS

A clustering-based, semi-automated method was developed to build corpus-specific sense inventories for abbreviations in hospital admission notes. Evaluation demonstrated that this method could substantially reduce manual annotation cost and increase the completeness of sense inventories compared with a manual annotation method using random samples.

CONCLUSION

The authors developed an effective clustering-based method for building corpus-specific sense inventories for abbreviations in a clinical corpus. To the best of the authors' knowledge, this is the first time clustering technologies have been used to help build sense inventories of abbreviations in clinical text. The results demonstrated that the clustering-based method performed better than the manual annotation method using random samples for the task of building sense inventories of clinical abbreviations.
