Davis Allan Peter, Grondin Cynthia J, Lennon-Hopkins Kelley, Saraceni-Richards Cynthia, Sciaky Daniela, King Benjamin L, Wiegers Thomas C, Mattingly Carolyn J
Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695-7617, USA
Nucleic Acids Res. 2015 Jan;43(Database issue):D914-20. doi: 10.1093/nar/gku935. Epub 2014 Oct 17.
Ten years ago, the Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) was developed out of a need to formalize, harmonize and centralize the information on numerous genes and proteins responding to environmental toxic agents across diverse species. CTD's initial approach was to facilitate comparisons of nucleotide and protein sequences of toxicologically significant genes by curating these sequences and electronically annotating them with chemical terms from their associated references. Since then, however, CTD has vastly expanded its scope to robustly represent a triad of chemical-gene, chemical-disease and gene-disease interactions that are manually curated from the scientific literature by professional biocurators using controlled vocabularies, ontologies and structured notation. Today, CTD includes 24 million toxicogenomic connections relating chemicals/drugs, genes/proteins, diseases, taxa, phenotypes, Gene Ontology annotations, pathways and interaction modules. In this 10th anniversary update, we outline the evolution of CTD, including our increased data content, new 'Pathway View' visualization tool, enhanced curation practices, pilot chemical-phenotype results and impending exposure data set. The prototype database originally described in our first report has transformed into a sophisticated resource used actively today to help scientists develop and test hypotheses about the etiologies of environmentally influenced diseases.