Laboratory of Informatics, Robotics and Microelectronics of Montpellier (LIRMM), University of Montpellier & CNRS, Montpellier 34090, France.
Faculty of Medicine, University of New South Wales, Sydney, New South Wales 2052, Australia.
Bioinformatics. 2018 Jun 1;34(11):1962-1965. doi: 10.1093/bioinformatics/bty009.
Secondary use of clinical data commonly involves annotating biomedical text with terminologies and ontologies. The National Center for Biomedical Ontology (NCBO) Annotator is a frequently used annotation service that was originally designed for biomedical data and is not well suited to clinical text annotation. To add new functionalities to the NCBO Annotator without hosting or modifying the original Web service, we designed a proxy architecture that enables seamless extensions by pre-processing the input text and parameters and post-processing the annotations. We then implemented enhanced functionalities for annotating and indexing free text: scoring, detection of context (negation, experiencer, temporality), new output formats and coarse-grained concept recognition (with UMLS Semantic Groups). In this paper, we present the NCBO Annotator+, a Web service that incorporates these new functionalities, together with a small set of evaluation results for concept recognition and clinical context detection on two standard evaluation tasks (CLEF eHealth 2017, SemEval 2014).
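The proxy pattern described above can be sketched as follows. This is a minimal illustration and not the Annotator+ implementation: the function names, the annotation dictionary keys (`concept`, `from`, `to`), the trigger list and the three-token look-back window are all assumptions made for the example, and `fake_upstream` stands in for the real annotation service so that the sketch runs offline.

```python
import re
from typing import Callable, Dict, List

Annotation = Dict[str, object]

# Hypothetical trigger words; real negation detection uses far richer rules.
NEGATION_TRIGGERS = ("no", "not", "without", "denies")


def preprocess(text: str) -> str:
    """Pre-processing step: normalize whitespace before forwarding the text."""
    return re.sub(r"\s+", " ", text).strip()


def postprocess(text: str, annotations: List[Annotation]) -> List[Annotation]:
    """Post-processing step: add a toy 'negated' flag to each annotation."""
    enriched = []
    for ann in annotations:
        start = int(ann["from"])
        # Inspect up to three tokens that precede the annotated span.
        window = text[:start].lower().split()[-3:]
        negated = any(tok.strip(".,;:") in NEGATION_TRIGGERS for tok in window)
        enriched.append({**ann, "negated": negated})
    return enriched


def annotate_via_proxy(
    text: str, upstream: Callable[[str], List[Annotation]]
) -> List[Annotation]:
    """Proxy: pre-process, delegate to the unmodified service, post-process."""
    clean = preprocess(text)
    raw = upstream(clean)
    return postprocess(clean, raw)


def fake_upstream(text: str) -> List[Annotation]:
    """Stand-in for the original annotation service (assumption for the demo)."""
    found: List[Annotation] = []
    for term in ("fever", "cough"):
        for match in re.finditer(term, text):
            found.append({"concept": term, "from": match.start(), "to": match.end()})
    return found


result = annotate_via_proxy("Patient  has fever but\nno cough.", fake_upstream)
```

Because the upstream service is passed in as a callable, the original service stays untouched and each extension lives entirely in the pre- or post-processing step, which is the property the proxy design relies on.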
The Annotator+ has been successfully integrated into the SIFR BioPortal platform (an implementation of NCBO BioPortal for French biomedical terminologies and ontologies) to annotate English text. A Web user interface is available for testing and ontology selection (http://bioportal.lirmm.fr/ncbo_annotatorplus); however, the Annotator+ is meant to be used through the Web service application programming interface (http://services.bioportal.lirmm.fr/ncbo_annotatorplus). The code is openly available, and we also provide a Docker package that enables easy local deployment so that sensitive (e.g. clinical) data can be processed in-house (https://github.com/sifrproject).
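Since the service is intended for programmatic use, a request URL might be assembled as in the sketch below. Only the base URL comes from the text above. The helper `build_request_url` is hypothetical, and the query parameter names (`text`, `ontologies`, `apikey`, `negation`) are assumptions that should be checked against the service documentation before use. The sketch only builds the URL and sends nothing over the network.

```python
from typing import Iterable
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL as given in the availability statement.
BASE_URL = "http://services.bioportal.lirmm.fr/ncbo_annotatorplus"


def build_request_url(
    text: str, ontologies: Iterable[str], apikey: str, **options: object
) -> str:
    """Build a GET URL; parameter names are assumptions, not a confirmed API."""
    params = {
        "text": text,
        "ontologies": ",".join(ontologies),
        "apikey": apikey,
    }
    for name, value in options.items():
        # Booleans are serialized as lowercase 'true'/'false' (an assumption).
        params[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return f"{BASE_URL}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder; "MESH" is used as an example ontology acronym.
url = build_request_url("no fever", ["MESH"], "YOUR_API_KEY", negation=True)
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

For a local Docker deployment, the same helper would apply with `BASE_URL` pointed at the in-house host, which keeps clinical text from leaving the local network.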
Supplementary data are available at Bioinformatics online.