Davis Allan P, Murphy Cynthia G, Rosenstein Michael C, Wiegers Thomas C, Mattingly Carolyn J
Department of Bioinformatics, The Mount Desert Island Biological Laboratory, Salisbury Cove, Maine 04672 USA.
BMC Med Genomics. 2008 Oct 9;1:48. doi: 10.1186/1755-8794-1-48.
The etiology of many chronic diseases involves interactions between environmental factors and genes that modulate physiological processes. Understanding interactions between environmental chemicals and genes/proteins may provide insights into the mechanisms of chemical actions, disease susceptibility, toxicity, and therapeutic drug interactions. The Comparative Toxicogenomics Database (CTD; http://ctd.mdibl.org) provides these insights by curating and integrating data describing relationships between chemicals, genes/proteins, and human diseases. To illustrate the scope and application of CTD, we present an analysis of curated data for the chemical arsenic. Arsenic represents a major global environmental health threat and is associated with many diseases. The mechanisms by which arsenic modulates these diseases are not well understood.
Curated interactions between arsenic compounds and genes were downloaded using export and batch query tools at CTD. The list of genes was analyzed for molecular interactions, Gene Ontology (GO) terms, KEGG pathway annotations, and inferred disease relationships.
CTD contains curated data from the published literature describing 2,738 molecular interactions between 21 different arsenic compounds and 1,456 genes and proteins. Analysis of these genes and proteins provides insight into the biological functions and molecular networks that are affected by exposure to arsenic, including stress response, apoptosis, cell cycle, and specific protein signaling pathways. Integrating arsenic-gene data with gene-disease data yields a list of diseases that may be associated with arsenic exposure, together with the genes that may explain each association.
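The integration step described above, combining chemical-gene data with gene-disease data to infer candidate diseases, can be illustrated with a minimal sketch. It is not CTD's actual implementation or file format: the record layout, the function name `infer_diseases`, and all gene and disease labels are hypothetical placeholders chosen for illustration. The sketch assumes that a disease is inferred for a chemical whenever at least one gene is shared between the two curated sets.

```python
from collections import defaultdict

# Hypothetical toy records for illustration only; these are NOT CTD data.
# Each chemical-gene pair stands for one curated interaction.
chemical_gene = [
    ("arsenic trioxide", "GENE_A"),
    ("sodium arsenite", "GENE_A"),
    ("sodium arsenite", "GENE_B"),
    ("arsenic trioxide", "GENE_C"),
]

# Each gene-disease pair stands for one curated association.
gene_disease = [
    ("GENE_A", "Disease X"),
    ("GENE_B", "Disease X"),
    ("GENE_B", "Disease Y"),
    ("GENE_D", "Disease Z"),
]


def infer_diseases(chemical_gene, gene_disease):
    """Map each disease to the sorted genes it shares with the chemical set.

    Assumption: a disease is inferred if at least one of its associated
    genes also appears in a chemical-gene interaction.
    """
    chem_genes = {gene for _, gene in chemical_gene}
    inferred = defaultdict(set)
    for gene, disease in gene_disease:
        if gene in chem_genes:
            inferred[disease].add(gene)
    return {disease: sorted(genes) for disease, genes in inferred.items()}


inferred = infer_diseases(chemical_gene, gene_disease)

# List diseases with the most supporting genes first, then alphabetically.
for disease, genes in sorted(inferred.items(), key=lambda kv: (-len(kv[1]), kv[0])):
    print(f"{disease}: {', '.join(genes)}")
```

In this toy example, "Disease Z" is not inferred because its only gene has no curated interaction with an arsenic compound. The shared genes returned for each disease are the candidates that may explain the association and would be the starting point for hypothesis building.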
CTD data integration and curation strategies yield insight into the actions of environmental chemicals and provide a basis for developing hypotheses about the molecular mechanisms underlying the etiology of environmental diseases. While many reports describe the molecular response to arsenic, CTD integrates these data with additional curated data sets that facilitate construction of chemical-gene-disease networks and provide the groundwork for investigating the molecular basis of arsenic-associated diseases or toxicity. The analysis reported here is extensible to any environmental chemical or therapeutic drug.