Suvorov Alexander, Salemme Victoria, McGaunn Joseph, Poluyanoff Anthony, Amir Saira
Department of Environmental Health Sciences, School of Public Health and Health Sciences, University of Massachusetts, 173B-Goessmann, 686 North Pleasant Street, Amherst, MA 01003, USA.
Department of Biosciences, COMSATS University Islamabad, Pakistan.
Data Brief. 2020 Oct 9;33:106398. doi: 10.1016/j.dib.2020.106398. eCollection 2020 Dec.
A dataset of chemical-gene interactions was created by extracting data from the Comparative Toxicogenomics Database (CTD) using the following filtering criteria: data were extracted only from experiments that used human, rat, or mouse cells or tissues and that used high-throughput approaches for gene expression analysis. Genes not present in the genomes of all three species were filtered out. The resulting dataset included 591,084 chemical-gene interactions. All chemical compounds in the dataset were annotated for their major uses. For every gene in the dataset, the number of chemical-gene interactions was calculated and used as a metric of the gene's sensitivity to a variety of chemical exposures. The lists of genes with their corresponding numbers of chemical-gene interactions were used in gene-set enrichment analysis (GSEA) to identify the potential sensitivity to chemical exposures of molecular pathways in the Hallmark, KEGG, and Reactome collections. The data presented here thus represent unbiased and searchable datasets of the sensitivity of genes and molecular pathways to a broad range of chemical exposures. As such, the data can be used for a diverse range of toxicological and regulatory applications. This approach to identifying molecular mechanisms sensitive to chemical exposures may inform regulatory toxicology about the best toxicity testing strategies. An analysis of the sensitivity of genes and molecular pathways to chemical exposures based on these datasets was published in Chemosphere (Suvorov et al., 2021) [1].
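The per-gene sensitivity metric described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The record layout (chemical, gene, organism), the function names `count_interactions` and `ranked_gene_list`, and the placeholder chemical and gene labels are all hypothetical, and the sketch assumes that each retained record counts as one interaction.

```python
from collections import Counter

# Organisms retained by the filtering criteria (human, rat, mouse).
ALLOWED_ORGANISMS = {"Homo sapiens", "Rattus norvegicus", "Mus musculus"}


def count_interactions(records, shared_genes):
    """Count chemical-gene interactions per gene.

    records: iterable of (chemical, gene, organism) tuples (hypothetical layout).
    shared_genes: set of genes present in the genomes of all three species.
    Assumption: every retained record counts as one interaction.
    """
    counts = Counter()
    for chemical, gene, organism in records:
        if organism in ALLOWED_ORGANISMS and gene in shared_genes:
            counts[gene] += 1
    return counts


def ranked_gene_list(counts):
    """Return (gene, count) pairs, highest count first, ties broken by gene name."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Placeholder records for illustration only; these are not data from the dataset.
records = [
    ("chemical_A", "GENE1", "Homo sapiens"),
    ("chemical_B", "GENE1", "Mus musculus"),
    ("chemical_A", "GENE2", "Rattus norvegicus"),
    ("chemical_C", "GENE2", "Danio rerio"),   # dropped: organism not retained
    ("chemical_C", "GENE3", "Homo sapiens"),  # dropped: gene not shared by all three species
]
shared_genes = {"GENE1", "GENE2"}

counts = count_interactions(records, shared_genes)
ranking = ranked_gene_list(counts)
print(ranking)
```

A ranked list of this shape, with genes ordered by interaction count, is the kind of input that a pre-ranked gene-set enrichment analysis typically expects.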