Xu Rong, Das Amar K, Garber Alan M
Center for Biomedical Informatics Research.
AMIA Annu Symp Proc. 2009 Nov 14;2009:709-13.
Definitions of medical concepts (e.g., diseases, drugs) are essential background knowledge for researchers, clinicians and health care consumers. However, the rapid growth of biomedical research requires that such knowledge be continually updated. To address this problem, we have developed an unsupervised pattern learning approach that extracts disease and drug definitions from automatically structured randomized clinical trial (RCT) abstracts. In addition, each extracted definition is semantically classified without relying on external medical knowledge. When used to identify definitions from 100 manually annotated RCT abstracts, our medical definition knowledge base has a precision of 0.97, recall of 0.93, F1 of 0.94 and semantic classification accuracy of 0.96.
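To make the idea of pattern-based definition extraction concrete, the following is a minimal sketch and not the authors' method. The abstract describes patterns learned without supervision; here the two lexical patterns, the cue-word lists in `CLASS_CUES`, the function names and the example sentences are all hypothetical, hand-written stand-ins chosen only for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class Definition(NamedTuple):
    term: str
    definition: str
    semantic_class: Optional[str]


# Hypothetical hand-written patterns (the paper learns its patterns instead).
PATTERNS = [
    # "<Term> is a/an/the <definition>"
    re.compile(r"^(?P<term>[A-Z][\w\- ]*?) is (?P<definition>(?:a|an|the) [^.;]+)"),
    # "<Term>, a/an <definition>, ..." (appositive)
    re.compile(r"^(?P<term>[A-Z][\w\- ]*?), (?P<definition>(?:a|an) [^,.;]+),"),
]

# Hypothetical cue words found inside the definition text itself.
CLASS_CUES = {
    "disease": ("disease", "disorder", "syndrome", "infection"),
    "drug": ("inhibitor", "antagonist", "agonist", "antibiotic", "drug"),
}


def classify(definition: str) -> Optional[str]:
    """Return the first class whose cue word occurs in the definition, else None."""
    lowered = definition.lower()
    for label, cues in CLASS_CUES.items():
        if any(cue in lowered for cue in cues):
            return label
    return None


def extract_definitions(sentences: List[str]) -> List[Definition]:
    """Apply each pattern in order and keep the first match per sentence."""
    found = []
    for sentence in sentences:
        for pattern in PATTERNS:
            match = pattern.search(sentence)
            if match:
                definition = match.group("definition").strip()
                term = match.group("term").strip()
                found.append(Definition(term, definition, classify(definition)))
                break
    return found


if __name__ == "__main__":
    # Made-up example sentences, not taken from the evaluated RCT abstracts.
    examples = [
        "Rheumatoid arthritis is a chronic inflammatory disease of the joints.",
        "Etanercept, a tumor necrosis factor inhibitor, was given twice weekly.",
        "Patients were randomized to two groups.",
    ]
    for item in extract_definitions(examples):
        print(item)
```

In this sketch the semantic class is read off the definition text alone, which loosely mirrors the stated goal of classifying without external medical knowledge; a fixed cue list is of course far cruder than a learned approach.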