TeamTat: a collaborative text annotation tool.

Affiliations

National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD 20894, USA.

School of Software Convergence, Myongji University, Seoul 03674, South Korea.

Publication information

Nucleic Acids Res. 2020 Jul 2;48(W1):W5-W11. doi: 10.1093/nar/gkaa333.

Abstract

Manually annotated data is key to developing text-mining and information-extraction algorithms. However, human annotation requires considerable time, effort and expertise. Given the rapid growth of biomedical literature, it is paramount to build tools that facilitate speed and maintain expert quality. While existing text annotation tools may provide user-friendly interfaces to domain experts, limited support is available for figure display, project management, and multi-user team annotation. In response, we developed TeamTat (https://www.teamtat.org), a web-based annotation tool (local setup available), equipped to manage team annotation projects engagingly and efficiently. TeamTat is a novel tool for managing multi-user, multi-label document annotation, reflecting the entire production life cycle. Project managers can specify annotation schema for entities and relations and select annotator(s) and distribute documents anonymously to prevent bias. Document input format can be plain text, PDF or BioC (uploaded locally or automatically retrieved from PubMed/PMC), and output format is BioC with inline annotations. TeamTat displays figures from the full text for the annotator's convenience. Multiple users can work on the same document independently in their workspaces, and the team manager can track task completion. TeamTat provides corpus quality assessment via inter-annotator agreement statistics, and a user-friendly interface convenient for annotation review and inter-annotator disagreement resolution to improve corpus quality.
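
The abstract mentions inter-annotator agreement statistics but does not say which measure TeamTat computes. As an illustration only, the sketch below scores exact-match agreement between two annotators on one document and lists the annotations that would need review. The `Span` schema, the `pairwise_agreement` function, and the choice of precision/recall/F1 are assumptions made for this example, not TeamTat's actual implementation or API.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """One entity annotation (hypothetical schema): character offsets plus a label."""
    start: int
    end: int
    label: str


def pairwise_agreement(a: set[Span], b: set[Span]) -> dict:
    """Exact-match agreement between two annotators on the same document.

    Annotator `a` is treated as the reference, so precision and recall
    swap if the arguments are swapped; F1 is symmetric.
    """
    matched = len(a & b)  # same offsets and same label
    precision = matched / len(b) if b else 0.0
    recall = matched / len(a) if a else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "matched": matched,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Annotations made by only one annotator: candidates for review.
        "disagreements": sorted(a ^ b),
    }


if __name__ == "__main__":
    annotator_1 = {Span(0, 5, "Gene"), Span(10, 18, "Disease"), Span(25, 30, "Chemical")}
    annotator_2 = {Span(0, 5, "Gene"), Span(10, 18, "Chemical"),
                   Span(25, 30, "Chemical"), Span(40, 44, "Gene")}
    print(pairwise_agreement(annotator_1, annotator_2))
```

Exact matching is the strictest option: a span with the right offsets but a different label counts as a disagreement for both annotators. A real project might also report relaxed (overlap-based) matching or a chance-corrected statistic.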

Cited by

1. Enhancing biomedical relation extraction with directionality. Bioinformatics. 2025 Jul 1;41(Supplement_1):i68-i76. doi: 10.1093/bioinformatics/btaf226.
2. PDF Entity Annotation Tool (PEAT). J Open Source Softw. 2025 Apr 8;10(108):5336. doi: 10.21105/joss.05336.