Wang Shan-Shan, Wang Chia-Chi, Wang Chien-Lun, Lin Ying-Chi, Tung Chun-Wei
Ph.D. Program in Environmental and Occupational Medicine, College of Medicine, Kaohsiung Medical University and National Health Research Institutes, Kaohsiung 80708, Taiwan.
Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County 35053, Taiwan.
J Xenobiot. 2024 Jul 31;14(3):1023-1035. doi: 10.3390/jox14030057.
In silico toxicogenomics methods are resource- and time-efficient approaches for inferring chemical-protein-disease associations, and they provide potential mechanistic information for exploring toxicological effects. However, current in silico toxicogenomics systems make inferences based only on chemical-protein interactions, without considering tissue-specific gene/protein expression. As a result, inferred diseases can be overpredicted, yielding false positives. In this work, six tissue-specific expression datasets of genes and proteins were collected from the Expression Atlas. Genes were then categorized into high, medium, and low expression levels in a tissue- and dataset-specific manner. Subsequently, the tissue-specific expression datasets were incorporated into the chemical-protein-disease inference process of our ChemDIS system by filtering out relatively low-expressed genes. Incorporating tissue-specific gene/protein expression data improved the enrichment rate for chemical-disease inference by up to 62.26%. A case study of melamine showed that the proposed method can identify more specific disease terms that are consistent with the literature. A user-friendly interface was implemented in the ChemDIS system. The methodology is expected to be useful for chemical-disease inference and can be implemented in other in silico toxicogenomics tools.
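The categorize-then-filter step described in the abstract can be illustrated with a minimal sketch. This is not the ChemDIS implementation: the abstract does not state how the high, medium, and low levels are defined, so the tertile cut-offs, the function names (`categorize_expression`, `filter_targets`), the choice to drop genes with no expression measurement, and the gene identifiers and values are all assumptions made for illustration.

```python
from statistics import quantiles


def categorize_expression(expr_by_gene):
    """Label each gene 'low', 'medium', or 'high' within ONE tissue of ONE dataset.

    Assumption: tertile cut-offs computed from that tissue's own values.
    The source only says the categorization is tissue- and dataset-specific.
    """
    low_cut, high_cut = quantiles(expr_by_gene.values(), n=3)
    labels = {}
    for gene, value in expr_by_gene.items():
        if value <= low_cut:
            labels[gene] = "low"
        elif value <= high_cut:
            labels[gene] = "medium"
        else:
            labels[gene] = "high"
    return labels


def filter_targets(chemical_targets, labels, keep=("medium", "high")):
    """Keep only interacting genes that are not low-expressed in the tissue.

    Assumption: genes absent from the expression data are dropped as well.
    """
    return [gene for gene in chemical_targets if labels.get(gene) in keep]


# Hypothetical expression values for one tissue in one dataset.
kidney_expression = {
    "GENE_A": 0.4, "GENE_B": 12.0, "GENE_C": 850.0,
    "GENE_D": 3.1, "GENE_E": 95.0, "GENE_F": 0.9,
}
# Hypothetical genes interacting with a chemical of interest.
chemical_targets = ["GENE_A", "GENE_B", "GENE_C", "GENE_X"]

kidney_labels = categorize_expression(kidney_expression)
print(filter_targets(chemical_targets, kidney_labels))
```

Only the filtered gene list would then be passed to the disease enrichment step, so associations supported solely by genes that are low-expressed in the selected tissue no longer contribute to the inference.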