Integrating AI-powered text mining from PubTator into the manual curation workflow at the Comparative Toxicogenomics Database.

Author information

Wiegers Thomas C, Davis Allan Peter, Wiegers Jolene, Sciaky Daniela, Barkalow Fern, Wyatt Brent, Strong Melissa, McMorran Roy, Abrar Sakib, Mattingly Carolyn J

Affiliations

Department of Biological Sciences, North Carolina State University, Toxicology Building, 850 Main Campus Drive, Raleigh, NC 27695, USA.

Center for Human Health and the Environment, North Carolina State University, Toxicology Building, 850 Main Campus Drive, Raleigh, NC 27695, USA.

Publication information

Database (Oxford). 2025 Feb 21;2025. doi: 10.1093/database/baaf013.


DOI:10.1093/database/baaf013
PMID:39982792
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11844237/
Abstract

The Comparative Toxicogenomics Database (CTD) is a manually curated knowledge- and discovery-base that seeks to advance understanding about the relationship between environmental exposures and human health. CTD's manual curation process extracts from the biomedical literature molecular relationships between chemicals/drugs, genes/proteins, phenotypes, diseases, anatomical terms, and species. These relationships are organized in a highly systematic way in order to make them not only informative but also scientifically computational, enabling inferential hypotheses to be formed to address gaps in understanding. Integral to CTD's functionality is the use of structured, hierarchical ontologies and controlled vocabularies to describe these molecular relationships. Normalizing text (i.e. translating raw text from the literature into these controlled vocabularies) can be a time-consuming process for biocurators. To facilitate the normalization process and improve the efficiency with which our scientists curate the literature, CTD evaluated and integrated into the curation process PubTator 3.0, a state-of-the-art, AI-powered resource which extracts and normalizes from the literature many of the key biomedical concepts CTD curates. Here, we describe CTD's long-standing history with Natural Language Processing (NLP), how this history helped form our objectives for NLP integration, the evaluation of PubTator against our objectives, and the integration of PubTator into CTD's curation workflow. Database URL: https://ctdbase.org.
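The normalization step mentioned in the abstract (mapping raw text from the literature onto controlled-vocabulary terms) can be illustrated with a small dictionary lookup. This is only a minimal sketch under stated assumptions: the vocabulary, the identifiers (`CHEM:0001` and so on), and the function name `normalize_mentions` are hypothetical, and the code does not reflect how PubTator 3.0 or CTD's curation tools work internally.

```python
import re

# Hypothetical controlled vocabulary: synonym -> (concept type, identifier).
# The identifiers are made up for illustration; they are not CTD or MeSH accessions.
TOY_VOCABULARY = {
    "arsenic": ("chemical", "CHEM:0001"),
    "sodium arsenite": ("chemical", "CHEM:0002"),
    "arsenite": ("chemical", "CHEM:0003"),
    "tp53": ("gene", "GENE:0001"),
    "p53": ("gene", "GENE:0001"),
    "lung cancer": ("disease", "DIS:0001"),
    "lung neoplasms": ("disease", "DIS:0001"),
}


def normalize_mentions(text, vocabulary=TOY_VOCABULARY):
    """Map raw mentions in `text` to vocabulary identifiers.

    Returns a list of (start, end, surface_text, concept_type, identifier)
    tuples sorted by position. Longer synonyms are tried first, so an
    overlapping shorter synonym never overrides a longer match.
    """
    taken = [False] * len(text)  # characters already covered by a match
    hits = []
    for synonym in sorted(vocabulary, key=len, reverse=True):
        # Match the synonym only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.span()
            if any(taken[start:end]):
                continue  # overlaps a longer match found earlier
            for i in range(start, end):
                taken[i] = True
            concept_type, identifier = vocabulary[synonym]
            hits.append((start, end, text[start:end], concept_type, identifier))
    return sorted(hits)


if __name__ == "__main__":
    sentence = "Sodium arsenite altered p53 expression in lung cancer cells."
    for hit in normalize_mentions(sentence):
        print(hit)
```

In this toy version, synonyms such as "p53" and "tp53" resolve to the same identifier, which is the property that makes normalized annotations comparable across papers. A real system also has to handle ambiguity, abbreviations, and species context, which a plain dictionary lookup does not.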

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/504c/11844237/7858f468d5bd/baaf013f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/504c/11844237/088bfe52d0cb/baaf013f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/504c/11844237/d7641a121562/baaf013f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/504c/11844237/1d320ad8fd71/baaf013f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/504c/11844237/d8d7e9f4b09a/baaf013f5.jpg
