Feasibility of pooling annotated corpora for clinical concept extraction.

Authors

Wagholikar Kavishwar, Torii Manabu, Jonnalagadda Siddhartha, Liu Hongfang

Affiliation

Mayo Clinic, Rochester, MN

Publication information

AMIA Jt Summits Transl Sci Proc. 2012;2012:38. Epub 2012 Mar 19.

Abstract

The availability of annotated corpora has facilitated the application of machine learning algorithms to concept extraction from clinical notes. However, preparing annotated corpora at individual institutions is expensive, and pooling annotated corpora from other institutions is a potential solution. In this paper we investigate whether pooling corpora from two different sources can improve the performance and portability of the resulting machine learning taggers for medical problem detection. Specifically, we pool corpora from the 2010 i2b2/VA NLP challenge and Mayo Clinic Rochester to evaluate taggers for recognition of medical problems. Contrary to our expectations, pooling the corpora is found to decrease the F1-score. We examine the annotation guidelines to identify the factors behind the incompatibility of the corpora, and we suggest that the clinical NLP community develop a standard annotation guideline so that annotated corpora are compatible.
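For readers unfamiliar with the metric, the F1-score reported above is the harmonic mean of precision and recall over the extracted concept mentions. The following is a minimal sketch, not the authors' evaluation code: it assumes exact-span matching (the abstract does not state the matching criterion), and the function name `span_f1` and the toy spans are hypothetical.

```python
def span_f1(gold, predicted):
    """Exact-match precision, recall, and F1 over (doc_id, start, end) spans.

    Assumption: a predicted span counts as correct only if it matches a
    gold span exactly; the abstract does not specify the criterion used.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy annotations: (note id, start token, end token).
gold_spans = [("note1", 0, 2), ("note1", 5, 6), ("note2", 3, 4), ("note2", 8, 10)]
pred_spans = [("note1", 0, 2), ("note2", 3, 4), ("note2", 7, 10)]

precision, recall, f1 = span_f1(gold_spans, pred_spans)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Under this criterion, a tagger trained on a corpus whose guideline draws span boundaries differently (the third predicted span above is off by one token) is penalized in both precision and recall, which is one way guideline incompatibility between pooled corpora could lower F1.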

Cited by

Clinical concept extraction: A methodology review.
J Biomed Inform. 2020 Sep;109:103526. doi: 10.1016/j.jbi.2020.103526. Epub 2020 Aug 6.

Identifying Peripheral Arterial Disease Cases Using Natural Language Processing of Clinical Notes.
IEEE EMBS Int Conf Biomed Health Inform. 2016 Feb;2016:126-131. doi: 10.1109/BHI.2016.7455851. Epub 2016 Apr 21.
