Children's Hospital Boston Informatics Program and Harvard Medical School, Boston, Massachusetts 02114, USA.
J Am Med Inform Assoc. 2011 Jul-Aug;18(4):459-65. doi: 10.1136/amiajnl-2011-000108. Epub 2011 Apr 1.
The long-term goal of this work is the automated discovery of anaphoric relations from the clinical narrative. The creation of a gold standard set from a cross-institutional corpus of clinical notes and high-level characteristics of that gold standard are described.
A standard methodology for annotation guideline development, gold standard annotations, and inter-annotator agreement (IAA) was used.
The gold standard annotations resulted in 7214 markables, 5992 pairs, and 1304 chains. Each report averaged 40 anaphoric markables, 33 pairs, and seven chains. The overall IAA was high on the Mayo dataset (0.6607) and moderate on the University of Pittsburgh Medical Center (UPMC) dataset (0.4072). The IAA between each annotator and the gold standard was high (Mayo: 0.7669, 0.7697, and 0.9021; UPMC: 0.6753 and 0.7138). These results indicate a corpus of sufficient quality for system development. They also suggest that the experts' annotations are complementary and that an annotator team with diverse knowledge backgrounds is important.
Only one of the annotators had the linguistic background necessary for annotation of the linguistic attributes. The overall generalizability of the guidelines will be further strengthened by annotations of data from additional sites. This will increase the overall corpus size and the representation of each relation type.
This work describes the first step toward developing an anaphoric relation resolver as part of a comprehensive natural language processing system designed specifically for the clinical narrative in the electronic medical record. The deidentified annotated corpus will be available to researchers.