Pérez-Pérez Martin, Pérez-Rodríguez Gael, Blanco-Míguez Aitor, Fdez-Riverola Florentino, Valencia Alfonso, Krallinger Martin, Lourenço Anália
Department of Computer Science, ESEI, University of Vigo, Campus As Lagoas, 32004, Ourense, Spain.
The Biomedical Research Centre (CINBIO), Campus Universitario Lagoas-Marcosende, 36310, Vigo, Spain.
J Cheminform. 2019 Jun 24;11(1):42. doi: 10.1186/s13321-019-0363-6.
Shared tasks and community challenges are key instruments for promoting research and collaboration and for determining the state of the art of biomedical and chemical text mining technologies. Traditionally, such tasks relied on comparing automatically generated results against a so-called Gold Standard dataset of manually labelled textual data, without regard to the efficiency and robustness of the underlying implementations. Given the rapid growth of unstructured data collections, including patent databases and particularly the scientific literature, there is a pressing need to generate, assess and expose robust big data text mining solutions that semantically enrich documents in real time. To address this need, a novel track called "Technical interoperability and performance of annotation servers" was launched under the umbrella of the BioCreative text mining evaluation effort. The aim of this track was to enable the continuous assessment of technical aspects of text annotation web servers, specifically of online biomedical named entity recognition systems of interest for medicinal chemistry applications.
A total of 15 out of 26 registered teams successfully implemented online annotation servers. These servers returned predictions in predefined formats over a two-month period and were evaluated through the BeCalm evaluation platform, developed specifically for this track. The track encompassed three levels of evaluation: data format considerations, technical metrics and functional specifications. Participating annotation servers were implemented in seven different programming languages and covered 12 general entity types. The continuous evaluation of server responses accounted for testing periods of low activity and of moderate to high activity, encompassing 4,092,502 requests in total from three different document provider settings. The median response time was below 3.74 s, with a median of 10 annotations per document. Most of the servers showed high reliability and stability, processing over 100,000 requests in a 5-day period.
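To illustrate the kind of payload such an annotation server would return, the sketch below implements a naive dictionary-based named entity recognizer that emits annotations with character offsets, entity type and a confidence score. This is a minimal illustration only: the entity dictionary is invented, and the field names (`document_id`, `init`, `end`, `annotated_text`, `type`, `score`) are assumptions inspired by typical annotation-server response formats, not the exact BeCalm schema.

```python
import json
import re

# Hypothetical entity dictionary (illustrative terms and types only).
ENTITY_DICT = {
    "aspirin": "CHEMICAL",
    "p53": "GENE",
    "diabetes": "DISEASE",
}

def annotate(document_id, text):
    """Naive dictionary-based NER: return one annotation per case-insensitive
    match of a dictionary term, with character offsets into `text`."""
    annotations = []
    for term, etype in ENTITY_DICT.items():
        for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
            annotations.append({
                "document_id": document_id,      # which document was annotated
                "init": m.start(),               # offset of first character
                "end": m.end(),                  # offset past the last character
                "annotated_text": m.group(0),    # the matched surface form
                "type": etype,                   # predicted entity type
                "score": 1.0,                    # dictionary hits: full confidence
            })
    return annotations

if __name__ == "__main__":
    payload = annotate("DOC1", "Aspirin use in diabetes patients.")
    print(json.dumps(payload, indent=2))
```

In a deployed server this function would sit behind an HTTP endpoint that receives document identifiers from the evaluation platform, fetches the text from the document provider, and returns the JSON list; response time would be measured from request receipt to payload delivery.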
The presented track was a novel experimental task that systematically evaluated the technical performance aspects of online entity recognition systems. It raised the interest of a significant number of participants. Future editions of the competition will address the ability to process documents in bulk as well as to annotate full-text documents.