Ahlers Caroline B, Fiszman Marcelo, Demner-Fushman Dina, Lang François-Michel, Rindflesch Thomas C
Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland 20894, USA.
Pac Symp Biocomput. 2007:209-20.
We describe a natural language processing system (Enhanced SemRep) to identify core assertions on pharmacogenomics in Medline citations. Extracted information is represented as semantic predications covering a range of relations relevant to this domain. The specific relations addressed by the system provide greater precision than that achievable with methods that rely on entity co-occurrence. The development of Enhanced SemRep is based on the adaptation of an existing system and crucially depends on domain knowledge in the Unified Medical Language System. We provide a preliminary evaluation (55% recall and 73% precision) and discuss the potential of this system in assisting both clinical practice and scientific investigation.
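As a rough illustration of the two ideas in the abstract, the sketch below represents a semantic predication as a subject-relation-object triple and computes precision and recall against a reference set. The `Predication` type, the `precision_recall` helper, and all triples are hypothetical toy values chosen for illustration; they are not taken from Enhanced SemRep or from the paper's evaluation data.

```python
from typing import Iterable, NamedTuple, Tuple


class Predication(NamedTuple):
    """A semantic predication: two arguments linked by a named relation."""
    subject: str
    relation: str
    object: str


def precision_recall(
    extracted: Iterable[Predication], gold: Iterable[Predication]
) -> Tuple[float, float]:
    """Return (precision, recall) of extracted predications against a gold set."""
    extracted_set, gold_set = set(extracted), set(gold)
    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical reference annotations (toy data, not from the paper).
gold = {
    Predication("CYP2D6", "INTERACTS_WITH", "codeine"),
    Predication("warfarin", "TREATS", "thrombosis"),
    Predication("VKORC1", "ASSOCIATED_WITH", "warfarin resistance"),
    Predication("CYP2C9", "INTERACTS_WITH", "warfarin"),
}

# Hypothetical system output: two correct triples and one with the wrong relation.
extracted = {
    Predication("CYP2D6", "INTERACTS_WITH", "codeine"),
    Predication("warfarin", "TREATS", "thrombosis"),
    Predication("CYP2C9", "INHIBITS", "warfarin"),
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because a predication only counts as correct when the relation matches as well as both arguments, this kind of scoring is stricter than scoring entity co-occurrence, where the pair (CYP2C9, warfarin) would count as a hit regardless of the relation asserted.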