A Quantitative and Qualitative Evaluation of Sentence Boundary Detection for the Clinical Domain.

Authors

Griffis Denis, Shivade Chaitanya, Fosler-Lussier Eric, Lai Albert M

Affiliations

Department of Computer Science and Engineering; National Institutes of Health, Rehabilitation Medicine Department, Mark O. Hatfield Clinical Research Center, Bethesda, MD.

Department of Computer Science and Engineering.

Publication information

AMIA Jt Summits Transl Sci Proc. 2016 Jul 20;2016:88-97. eCollection 2016.

Abstract

Sentence boundary detection (SBD) is a critical preprocessing task for many natural language processing (NLP) applications. However, there has been little work on evaluating how well existing methods for SBD perform in the clinical domain. We evaluate five popular off-the-shelf NLP toolkits on the task of SBD in various kinds of text using a diverse set of corpora, including the GENIA corpus of biomedical abstracts, a corpus of clinical notes used in the 2010 i2b2 shared task, and two general-domain corpora (the British National Corpus and Switchboard). We find that, with the exception of the cTAKES system, the toolkits we evaluate perform noticeably worse on clinical text than on general-domain text. We identify and discuss major classes of errors, and suggest directions for future work to improve SBD methods in the clinical domain. We also make the code used for SBD evaluation in this paper available for download at http://github.com/drgriffis/SBD-Evaluation.
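To make the evaluation task concrete, the following is a minimal sketch of how a sentence splitter's output might be scored against gold-standard boundaries. It is not the authors' evaluation code (that is in the repository linked above), and the abstract does not state which metrics the paper reports; exact-match precision, recall, and F1 over sentence-end character offsets are an assumption made here for illustration. The names `score_boundaries`, `naive_splitter`, and `end_offsets`, along with the sample note, are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class BoundaryScores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_boundaries(gold: Iterable[int], predicted: Iterable[int]) -> BoundaryScores:
    """Score predicted sentence-end offsets against gold ones by exact match.

    Assumption: a boundary counts as correct only if its character offset
    matches a gold offset exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return BoundaryScores(precision, recall, f1)


def naive_splitter(text: str) -> List[int]:
    """Toy baseline: treat every '.', '!' or '?' as a sentence end."""
    return [i for i, ch in enumerate(text) if ch in ".!?"]


def end_offsets(sentences: List[str]) -> List[int]:
    """Offsets of each sentence's last character once joined by single spaces."""
    offsets, position = [], 0
    for sentence in sentences:
        position += len(sentence)
        offsets.append(position - 1)
        position += 1  # account for the joining space
    return offsets


# Hypothetical clinical-style snippet, invented for this example only.
gold_sentences = ["Pt. denies chest pain.", "BP 120/80 mmHg.", "Follow up in 2 wks."]
note = " ".join(gold_sentences)

scores = score_boundaries(end_offsets(gold_sentences), naive_splitter(note))
print(scores)
```

In this toy case the baseline finds every true boundary but also splits after the abbreviation "Pt.", so recall is perfect while precision drops. Abbreviations of this kind are one plausible reason a general-purpose splitter might struggle on clinical notes.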
