Department of Biological Sciences.
Center for Human Health and the Environment, North Carolina State University, Raleigh, North Carolina 27695.
Toxicol Sci. 2020 Oct 1;177(2):392-404. doi: 10.1093/toxsci/kfaa113.
Environmental health studies examine how exposures (eg, chemicals) affect human health and disease; however, in most cases, the molecular and biological mechanisms connecting an exposure with a disease remain unknown. To help fill these knowledge gaps, we sought to leverage content from the public Comparative Toxicogenomics Database (CTD) to identify potential intermediary steps. In a proof-of-concept study, we systematically compute the genes, molecular mechanisms, and biological events underlying the environmental health association that links air pollution toxicants with 2 cardiovascular diseases (myocardial infarction and hypertension) as a test case. Our approach integrates 5 types of curated interactions in CTD to build sets of "CGPD-tetramers," computationally constructed information blocks relating a Chemical-Gene interaction with a Phenotype and Disease. This bioinformatics strategy generates 653 CGPD-tetramers for air pollution-associated myocardial infarction (involving 5 pollutants, 58 genes, and 117 phenotypes) and 701 CGPD-tetramers for air pollution-associated hypertension (involving 3 pollutants, 96 genes, and 142 phenotypes). Collectively, we identify 19 genes and 96 phenotypes shared between these 2 air pollutant-induced outcomes, and the results suggest important roles for oxidative stress, inflammation, immune responses, cell death, and circulatory system processes. Moreover, CGPD-tetramers can be assembled into extensive chemical-induced disease pathways involving multiple gene products and sequential biological events, and many of these computed intermediary steps are validated in the literature. Our method does not require a priori knowledge of the toxicant, interacting gene, or biological system, and can be used to analyze any environmental chemical-induced disease curated within the public CTD framework.
This bioinformatics strategy links and interrelates chemicals, genes, phenotypes, and diseases to fill knowledge gaps in environmental health studies. It is demonstrated here for air pollution-associated cardiovascular disease, and researchers can adapt it to any environmentally influenced disease of interest.
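The tetramer construction described above can be illustrated with a minimal sketch. The abstract does not list the 5 interaction types, so the sketch assumes they are the pairwise relations chemical-gene, chemical-phenotype, gene-phenotype, chemical-disease, and gene-disease, and that a tetramer is emitted only when all 5 relations support it. The function name `build_tetramers` and every record below (`chemA`, `GENE1`, `phenoX`, `diseaseD`, and so on) are hypothetical placeholders, not CTD data or CTD code.

```python
from itertools import product

# Hypothetical toy records standing in for the 5 curated interaction types.
# None of these pairs are real CTD entries.
chem_gene = {("chemA", "GENE1"), ("chemA", "GENE2"), ("chemB", "GENE1")}
chem_pheno = {("chemA", "phenoX"), ("chemA", "phenoY"), ("chemB", "phenoX")}
gene_pheno = {("GENE1", "phenoX"), ("GENE2", "phenoY")}
chem_disease = {("chemA", "diseaseD"), ("chemB", "diseaseD")}
gene_disease = {("GENE1", "diseaseD")}


def build_tetramers(chem_gene, chem_pheno, gene_pheno, chem_disease, gene_disease):
    """Return (chemical, gene, phenotype, disease) tuples supported by all 5 relations.

    Assumption: a tetramer requires every pairwise link except
    phenotype-disease, which is the relationship being inferred.
    """
    tetramers = set()
    for chem, gene in chem_gene:
        # Phenotypes linked to both the chemical and the gene.
        phenos = {p for c, p in chem_pheno if c == chem and (gene, p) in gene_pheno}
        # Diseases linked to both the chemical and the gene.
        diseases = {d for c, d in chem_disease if c == chem and (gene, d) in gene_disease}
        for pheno, disease in product(phenos, diseases):
            tetramers.add((chem, gene, pheno, disease))
    return tetramers


def summarize(tetramers):
    """Count distinct chemicals, genes, and phenotypes across a tetramer set."""
    chems = {t[0] for t in tetramers}
    genes = {t[1] for t in tetramers}
    phenos = {t[2] for t in tetramers}
    return {"tetramers": len(tetramers), "chemicals": len(chems),
            "genes": len(genes), "phenotypes": len(phenos)}


tetramers = build_tetramers(chem_gene, chem_pheno, gene_pheno, chem_disease, gene_disease)
summary = summarize(tetramers)
print(sorted(tetramers))
print(summary)
```

In this toy input, `GENE2` yields no tetramer because it lacks a gene-disease link, which shows how requiring every relation filters candidates. Genes or phenotypes shared between two outcomes, as reported for myocardial infarction and hypertension, could be obtained under the same assumptions by intersecting the gene or phenotype sets of two such tetramer collections.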