Gordon Claire L, Weng Chunhua
Department of Medicine, Columbia University Medical Center, 630 West 168th Street, New York, USA; Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA; Department of Medicine, University of Melbourne, Melbourne, VIC 3010, Australia.
Department of Biomedical Informatics, Columbia University Medical Center, 622 West 168th Street, New York, NY 10032, USA.
J Biomed Inform. 2015 Oct;57:42-52. doi: 10.1016/j.jbi.2015.07.014. Epub 2015 Jul 23.
A common bottleneck during ontology evaluation is knowledge acquisition from domain experts for gold standard creation. This paper contributes a novel semi-automated method for evaluating the concept coverage and accuracy of biomedical ontologies. The method complements expert knowledge with knowledge automatically extracted from clinical practice guidelines and electronic health records, which minimizes reliance on expensive domain expertise for gold standard generation.
We developed a bacterial clinical infectious diseases ontology (BCIDO) to assist clinical infectious disease treatment decision support. Using a semi-automated method, we integrated diverse knowledge sources, including publicly available infectious disease guidelines from international repositories, electronic health records, and expert-generated infectious disease case scenarios, to generate a compendium of infectious disease knowledge, which we then used to evaluate the accuracy and coverage of BCIDO.
BCIDO has three classes (i.e., infectious disease, antibiotic, bacteria) containing 593 distinct concepts and 2345 distinct concept relationships. Our semi-automated method generated an infectious disease (ID) knowledge compendium consisting of 637 concepts and 1554 concept relationships. Overall, BCIDO covered 79% (504/637) of the concepts and 89% (1378/1554) of the concept relationships in the ID compendium. BCIDO coverage of ID compendium concepts was 92% (121/131) for antibiotic, 80% (205/257) for infectious disease, and 72% (178/249) for bacteria. The low coverage of bacterial concepts in BCIDO was due to a difference in concept granularity between BCIDO and infectious disease guidelines. Guidelines and expert-generated scenarios were the richest sources of ID concepts and relationships, while patient records provided fewer concepts and relationships.
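The coverage figures above are simple ratios of covered to total compendium items. As a minimal sketch, the arithmetic can be reproduced from the reported counts alone; the names `coverage_pct` and `concept_counts` are illustrative and not taken from the paper, and the sketch does not reproduce how the counts themselves were obtained.

```python
def coverage_pct(covered: int, total: int) -> float:
    """Percentage of compendium items that are also present in the ontology."""
    return 100.0 * covered / total


# (covered by BCIDO, total in the ID compendium) per class, as reported above.
concept_counts = {
    "antibiotic": (121, 131),
    "infectious disease": (205, 257),
    "bacteria": (178, 249),
}

# Per-class concept coverage.
for name, (covered, total) in concept_counts.items():
    print(f"{name}: {coverage_pct(covered, total):.1f}% ({covered}/{total})")

# Overall concept coverage is the pooled ratio, not the mean of class ratios.
total_covered = sum(covered for covered, _ in concept_counts.values())
total_concepts = sum(total for _, total in concept_counts.values())
print(f"all concepts: {coverage_pct(total_covered, total_concepts):.1f}% "
      f"({total_covered}/{total_concepts})")

# Relationship coverage, from the reported counts.
print(f"relationships: {coverage_pct(1378, 1554):.1f}% (1378/1554)")
```

The per-class counts sum to the reported overall figures (504 covered of 637). Recomputed from the counts, the bacteria ratio 178/249 comes to about 71.5%, marginally below the reported 72%; the other values agree with the reported percentages after rounding.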
Our semi-automated method was cost-effective for generating a useful knowledge compendium with minimal reliance on domain experts. This method can be useful for continued development and evaluation of biomedical ontologies for better accuracy and coverage.