The Markyt visualisation, prediction and benchmark platform for chemical and gene entity recognition at BioCreative/CHEMDNER challenge.

Authors

Pérez-Pérez Martin, Pérez-Rodríguez Gael, Rabal Obdulia, Vazquez Miguel, Oyarzabal Julen, Fdez-Riverola Florentino, Valencia Alfonso, Krallinger Martin, Lourenço Anália

Affiliations

ESEI - Department of Computer Science, University of Vigo, Ourense, Spain.

Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain.

Publication information

Database (Oxford). 2016 Aug 19;2016. doi: 10.1093/database/baw120. Print 2016.

DOI:10.1093/database/baw120

PMID:27542845

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5001550/

Abstract

Biomedical text mining methods and technologies have improved significantly in the last decade. Considerable efforts have been invested in understanding the main challenges of biomedical literature retrieval and extraction and proposing solutions to problems of practical interest. Most notably, community-oriented initiatives such as the BioCreative challenge have enabled controlled environments for the comparison of automatic systems while pursuing practical biomedical tasks. Under this scenario, the present work describes the Markyt Web-based document curation platform, which has been implemented to support the visualisation, prediction and benchmark of chemical and gene mention annotations at BioCreative/CHEMDNER challenge. Creating this platform is an important step for the systematic and public evaluation of automatic prediction systems and the reusability of the knowledge compiled for the challenge. Markyt was not only critical to support the manual annotation and annotation revision process but also facilitated the comparative visualisation of automated results against the manually generated Gold Standard annotations and comparative assessment of generated results. We expect that future biomedical text mining challenges and the text mining community may benefit from the Markyt platform to better explore and interpret annotations and improve automatic system predictions.

Database URL: http://www.markyt.org, https://github.com/sing-group/Markyt.

Similar articles

The Markyt visualisation, prediction and benchmark platform for chemical and gene entity recognition at BioCreative/CHEMDNER challenge.

Database (Oxford). 2016 Aug 19;2016. doi: 10.1093/database/baw120. Print 2016.

BioCreative V CDR task corpus: a resource for chemical disease relation extraction.

Database (Oxford). 2016 May 9;2016. doi: 10.1093/database/baw068. Print 2016.

The CHEMDNER corpus of chemicals and drugs and its annotation principles.

J Cheminform. 2015 Jan 19;7(Suppl 1 Text mining for chemistry and the CHEMDNER track):S2. doi: 10.1186/1758-2946-7-S1-S2. eCollection 2015.

An evaluation of GO annotation retrieval for BioCreAtIvE and GOA.

BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S17. doi: 10.1186/1471-2105-6-S1-S17. Epub 2005 May 24.

Overview of the BioCreative III Workshop.

BMC Bioinformatics. 2011 Oct 3;12 Suppl 8(Suppl 8):S1. doi: 10.1186/1471-2105-12-S8-S1.

Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge.

Genome Biol. 2008;9 Suppl 2(Suppl 2):S1. doi: 10.1186/gb-2008-9-s2-s1. Epub 2008 Sep 1.

Assessing the state of the art in biomedical relation extraction: overview of the BioCreative V chemical-disease relation (CDR) task.

Database (Oxford). 2016 Mar 19;2016. doi: 10.1093/database/baw032. Print 2016.

Argo: enabling the development of bespoke workflows and services for disease annotation.

Database (Oxford). 2016 May 17;2016. doi: 10.1093/database/baw066. Print 2016.

Web services-based text-mining demonstrates broad impacts for interoperability and process simplification.

Database (Oxford). 2014 Jun 10;2014. doi: 10.1093/database/bau050. Print 2014.

Development of an information retrieval tool for biomedical patents.

Comput Methods Programs Biomed. 2018 Jun;159:125-134. doi: 10.1016/j.cmpb.2018.03.012. Epub 2018 Mar 14.

Cited by

TeamTat: a collaborative text annotation tool.

Nucleic Acids Res. 2020 Jul 2;48(W1):W5-W11. doi: 10.1093/nar/gkaa333.

Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool.

PLoS One. 2019 Mar 1;14(3):e0213266. doi: 10.1371/journal.pone.0213266. eCollection 2019.

Automatic identification of relevant chemical compounds from patents.

Database (Oxford). 2019 Jan 1;2019:baz001. doi: 10.1093/database/baz001.

Collaborative relation annotation and quality analysis in Markyt environment.

Database (Oxford). 2017 Jan 1;2017. doi: 10.1093/database/bax090.

The BioC-BioGRID corpus: full text articles annotated for curation of protein-protein and genetic interactions.

Database (Oxford). 2017 Jan 10;2017. doi: 10.1093/database/baw147. Print 2017.
