Integrating AI-powered text mining from PubTator into the manual curation workflow at the Comparative Toxicogenomics Database.

The Comparative Toxicogenomics Database (CTD) is a manually curated knowledge- and discovery-base that seeks to advance understanding about the relationship between environmental exposures and human health. CTD's manual curation process extracts from the biomedical literature molecular relationships between chemicals/drugs, genes/proteins, phenotypes, diseases, anatomical terms, and species. These relationships are organized in a highly systematic way in order to make them not only informative but also scientifically computational, enabling inferential hypotheses to be formed to address gaps in understanding. Integral to CTD's functionality is the use of structured, hierarchical ontologies and controlled vocabularies to describe these molecular relationships. Normalizing text (i.e. translating raw text from the literature into these controlled vocabularies) can be a time-consuming process for biocurators. To facilitate the normalization process and improve the efficiency with which our scientists curate the literature, CTD evaluated and integrated into the curation process PubTator 3.0, a state-of-the-art, AI-powered resource which extracts and normalizes from the literature many of the key biomedical concepts CTD curates. Here, we describe CTD's long-standing history with Natural Language Processing (NLP), how this history helped form our objectives for NLP integration, the evaluation of PubTator against our objectives, and the integration of PubTator into CTD's curation workflow. Database URL: https://ctdbase.org.


