Wyatt Brent, Davis Allan Peter, Wiegers Thomas C, Wiegers Jolene, Abrar Sakib, Sciaky Daniela, Barkalow Fern, Strong Melissa, Mattingly Carolyn J
Department of Biological Sciences, North Carolina State University, Raleigh, NC, United States.
Center for Human Health and the Environment, North Carolina State University, Raleigh, NC, United States.
Front Toxicol. 2024 Jul 22;6:1437884. doi: 10.3389/ftox.2024.1437884. eCollection 2024.
In environmental health, the specific molecular mechanisms connecting a chemical exposure to an adverse endpoint are often unknown, reflecting knowledge gaps. At the public Comparative Toxicogenomics Database (CTD; https://ctdbase.org/), we integrate manually curated, literature-based interactions to compute four-unit blocks of information organized as a potential step-wise molecular mechanism, known as "CGPD-tetramers," wherein a chemical interacts with a gene product to trigger a phenotype that can be linked to a disease. These computationally derived datasets can help fill those gaps and offer testable mechanistic information. Users can generate CGPD-tetramers at CTD for any combination of chemical, gene, phenotype, and/or disease of interest; however, such queries typically return thousands of CGPD-tetramers. Here, we describe a novel approach that uses R to transform these large datasets into user-friendly chord diagrams. The visualization process is simple to implement and accessible to users who have never used R before. Combining CGPD-tetramers into a single chord diagram helps identify potential key chemicals, genes, phenotypes, and diseases. This visualization allows users to more readily analyze computational datasets that can fill exposure knowledge gaps along the environmental health continuum.
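As a rough illustration of how thousands of CGPD-tetramers might be condensed into the weighted links a chord diagram draws, the following minimal sketch counts how many tetramers support each adjacent pair (chemical-gene, gene-phenotype, phenotype-disease). It is written in Python only to show the aggregation idea; the work described here uses R, and the abstract does not state how links are weighted, so the counting scheme, the function names `chord_links` and `node_weights`, and the placeholder rows are all assumptions rather than the authors' method or CTD data.

```python
from collections import Counter

# Hypothetical CGPD-tetramers as (chemical, gene, phenotype, disease) tuples.
# These rows are made-up placeholders, not data from CTD.
tetramers = [
    ("chemical_A", "gene_1", "phenotype_X", "disease_M"),
    ("chemical_A", "gene_2", "phenotype_X", "disease_M"),
    ("chemical_B", "gene_1", "phenotype_Y", "disease_M"),
    ("chemical_B", "gene_1", "phenotype_X", "disease_N"),
]


def chord_links(rows):
    """Count the tetramers supporting each adjacent pair of units.

    Assumption: a chord's width is the number of tetramers that
    contain that pair; other weighting schemes are possible.
    """
    links = Counter()
    for chemical, gene, phenotype, disease in rows:
        links[(chemical, gene)] += 1
        links[(gene, phenotype)] += 1
        links[(phenotype, disease)] += 1
    return links


def node_weights(links):
    """Sum the link counts touching each node to rank candidate key terms."""
    weights = Counter()
    for (source, target), count in links.items():
        weights[source] += count
        weights[target] += count
    return weights


links = chord_links(tetramers)
weights = node_weights(links)

# Emit a tab-separated source/target/count table, the shape of
# input that chord-diagram tools commonly accept.
for (source, target), count in sorted(links.items()):
    print(f"{source}\t{target}\t{count}")
```

Nodes with the largest summed weights (here `gene_1` and `phenotype_X`) are the ones that would appear as the widest sectors, which is how a single diagram can point to potential key chemicals, genes, phenotypes, and diseases.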