Evaluating natural language processors in the clinical domain.

Author information

Friedman C, Hripcsak G

Affiliation

Department of Computer Science, Queens College CUNY, New York, USA.

Publication information

Methods Inf Med. 1998 Nov;37(4-5):334-44.

PMID: 9865031

Abstract

Evaluating natural language processing (NLP) systems in the clinical domain is a difficult task which is important for advancement of the field. A number of NLP systems have been reported that extract information from free-text clinical reports, but not many of the systems have been evaluated. Those that were evaluated noted good performance measures but the results were often weakened by ineffective evaluation methods. In this paper we describe a set of criteria aimed at improving the quality of NLP evaluation studies. We present an overview of NLP evaluations in the clinical domain and also discuss the Message Understanding Conferences (MUC) [1-4]. Although these conferences constitute a series of NLP evaluation studies performed outside of the clinical domain, some of the results are relevant within medicine. In addition, we discuss a number of factors which contribute to the complexity that is inherent in the task of evaluating natural language systems.

