Chung Jeanhee, Murphy Shawn
Laboratory of Computer Science, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.
AMIA Annu Symp Proc. 2005;2005:131-5.
Gathering detailed patient information from narrative text is a significant barrier to clinical research. A prototype information extraction system was developed to identify concepts and their associated values in narrative echocardiogram reports. The system uses an architecture compatible with the Unified Medical Language System (UMLS) and exploits canonical patterns of language use to identify sentence templates from which concepts and their related values can be extracted. The extracted data will be used to enrich an existing database that clinical researchers in a large university healthcare system use to identify potential research candidates who meet clinical inclusion criteria. The system was developed and evaluated using ten clinical concepts. Concept-value pairs extracted by the system were compared with findings extracted manually by the author. The system achieved a recall of 78% (95% CI, 76-80%) for the relevant findings and a precision of 99% (95% CI, 98-99%).
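As a rough illustration of the template idea described in the abstract, the following Python sketch matches sentence templates against report text and scores the output against a manually built reference set. It is not the authors' implementation: the template patterns, the concept names (`ejection_fraction`, `left_atrial_size`), and the function names are all hypothetical, and no UMLS lookup is performed.

```python
import re

# Hypothetical sentence templates (not taken from the paper). Each maps a
# concept name to a regex whose named group "value" captures the finding.
TEMPLATES = {
    "ejection_fraction": re.compile(
        r"ejection fraction (?:is|was|of)\s+(?P<value>\d{1,2})\s*%",
        re.IGNORECASE,
    ),
    "left_atrial_size": re.compile(
        r"left atrium is\s+"
        r"(?P<value>normal|mildly dilated|moderately dilated|severely dilated)",
        re.IGNORECASE,
    ),
}


def extract_pairs(report):
    """Return (concept, value) pairs found by matching templates per sentence."""
    pairs = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        for concept, pattern in TEMPLATES.items():
            match = pattern.search(sentence)
            if match:
                pairs.append((concept, match.group("value").lower()))
    return pairs


def precision_recall(extracted, gold):
    """Compute precision and recall of extracted pairs against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented example text, not drawn from any real report.
    report = (
        "The left atrium is mildly dilated. "
        "The estimated ejection fraction is 55%. "
        "Mitral valve is normal."
    )
    gold = [
        ("left_atrial_size", "mildly dilated"),
        ("ejection_fraction", "55"),
        ("mitral_valve", "normal"),  # no template covers this concept
    ]
    found = extract_pairs(report)
    print(found)
    print(precision_recall(found, gold))
```

In this toy example every extracted pair is correct but one reference finding has no matching template, so precision is high and recall is lower. That is the same direction of trade-off the abstract reports, although the numbers here say nothing about the actual system.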