Department of Informatics, University at Albany, SUNY, Albany, New York, USA.
J Am Med Inform Assoc. 2013 Sep-Oct;20(5):806-13. doi: 10.1136/amiajnl-2013-001628. Epub 2013 Apr 5.
The Sixth Informatics for Integrating Biology and the Bedside (i2b2) Natural Language Processing Challenge for Clinical Records focused on temporal relations in clinical narratives. The organizers provided the research community with a corpus of discharge summaries annotated with temporal information, to be used for developing and evaluating temporal reasoning systems. Eighteen teams from around the world participated in the challenge. During the workshop, participating teams presented comprehensive reviews and analyses of their systems and outlined future research directions suggested by the challenge contributions.
The challenge evaluated systems on three information extraction tasks: (1) clinically significant events, including both clinical concepts (problems, tests, treatments, and clinical departments) and events relevant to the patient's clinical timeline, such as admissions and transfers between departments; (2) temporal expressions, meaning date, time, duration, or frequency phrases in the clinical text, whose extracted values had to be normalized to an ISO standard; and (3) temporal relations between the clinical events and temporal expressions. For the third task, participants determined which pairs of events and temporal expressions exhibited a temporal relation and identified the relation between them.
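The value normalization required in task (2) can be illustrated with a minimal rule-based sketch. It assumes the target format is ISO 8601-style strings (calendar dates as YYYY-MM-DD, durations as PnD/PnW/PnM/PnY) and that dates are written month-first. The function names `normalize_date` and `normalize_duration` and the handful of patterns are hypothetical, chosen for illustration only; they are not taken from any participating system.

```python
import re
from typing import Optional

# Month-name lookup used by the "Month DD, YYYY" rule (illustrative only).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Mapping from duration units to ISO 8601-style designators.
DURATION_UNITS = {"day": "D", "week": "W", "month": "M", "year": "Y"}


def normalize_date(text: str) -> Optional[str]:
    """Map a date phrase to YYYY-MM-DD, or return None if no rule matches.

    Assumption: numeric dates are month-first (MM/DD/YYYY). No calendar
    validation is performed, so out-of-range values pass through unchanged.
    """
    phrase = text.strip().lower()

    # Rule 1: MM/DD/YYYY
    match = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", phrase)
    if match:
        month, day, year = (int(group) for group in match.groups())
        return f"{year:04d}-{month:02d}-{day:02d}"

    # Rule 2: Month DD, YYYY (comma optional)
    match = re.fullmatch(r"([a-z]+)\s+(\d{1,2}),?\s+(\d{4})", phrase)
    if match and match.group(1) in MONTHS:
        month = MONTHS[match.group(1)]
        day, year = int(match.group(2)), int(match.group(3))
        return f"{year:04d}-{month:02d}-{day:02d}"

    return None


def normalize_duration(text: str) -> Optional[str]:
    """Map a phrase such as '3 days' to a duration string such as 'P3D'."""
    match = re.fullmatch(r"(\d+)\s+(day|week|month|year)s?", text.strip().lower())
    if match:
        return f"P{int(match.group(1))}{DURATION_UNITS[match.group(2)]}"
    return None
```

A real clinical normalizer would also need to resolve relative expressions (for example, "the next morning") against an anchor date such as the admission date, which this sketch does not attempt.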
For event detection, statistical machine learning (ML) methods consistently showed superior performance. While ML and rule-based methods seemed to detect temporal expressions equally well, the best systems overwhelmingly adopted a rule-based approach for value normalization. For temporal relation classification, systems using hybrid approaches that combined ML and heuristic-based methods produced the best results.