Romano Joseph D, Mei Liang, Senn Jonathan, Moore Jason H, Mortensen Holly M
Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States.
Center of Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, PA, United States.
Comput Toxicol. 2023 Feb;25. doi: 10.1016/j.comtox.2023.100261. Epub 2023 Jan 25.
Adverse outcome pathways (AOPs) provide a powerful tool for understanding the biological signaling cascades that lead to disease outcomes following toxicity. The framework outlines downstream responses known as key events, culminating in a clinically significant adverse outcome as the final result of the toxic exposure. Here we use the AOP framework combined with artificial intelligence (AI) methods to gain novel insights into the genetic mechanisms that underlie toxicity-mediated adverse health outcomes. Specifically, we focus on liver cancer (LC) as a case study: a clinically significant disease with diverse underlying mechanisms. Our approach uses two complementary AI techniques: generative modeling via automated machine learning and genetic algorithms, and graph machine learning. We use data from the US Environmental Protection Agency's Adverse Outcome Pathway Database (AOP-DB; aopdb.epa.gov) and the UK Biobank's genetic data repository. We use the AOP-DB to extract disease-specific AOPs and to build the graph neural networks used in our final analyses. We use the UK Biobank to retrieve real-world genotype and phenotype data, where genotypes are based on single nucleotide polymorphism data extracted from the AOP-DB, and phenotypes are case/control cohorts for the disease of interest (LC) corresponding to those AOPs. We also use propensity score matching to sample appropriately on important covariates (demographics, comorbidities, and social deprivation indices) and to balance the case and control populations in our machine learning training/testing datasets. Finally, we describe a novel putative risk factor for LC that depends on genetic variation in both the aryl-hydrocarbon receptor (AHR) and ATP binding cassette subfamily B member 11 (ABCB11) genes.
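To make the propensity score matching step concrete, the following is a minimal sketch and not the authors' implementation, which the abstract does not specify. It assumes a logistic regression propensity model fitted by plain gradient descent and greedy 1:1 nearest-neighbor matching without replacement. The function names, the caliper value, and the toy covariates are all hypothetical.

```python
import math


def predict_proba(w, x):
    """Logistic model: probability of being a case given covariates x."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent (bias is w[0])."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def match_controls(scores, is_case, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on propensity score.

    Each case is paired with the closest unused control; pairs whose
    score difference exceeds the caliper are dropped.
    """
    cases = [i for i, c in enumerate(is_case) if c]
    controls = {i for i, c in enumerate(is_case) if not c}
    pairs = []
    for i in cases:
        if not controls:
            break
        j = min(controls, key=lambda k: (abs(scores[k] - scores[i]), k))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs


# Toy cohort (hypothetical): standardized age and deprivation index.
X = [[0.9, 0.5], [0.7, 0.8], [0.6, 0.4], [-0.2, 0.1], [0.8, 0.6],
     [-0.5, -0.3], [0.5, 0.5], [-0.9, -0.7], [0.1, 0.0], [-0.6, -0.9]]
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = case, 0 = control

weights = fit_logistic(X, y)
scores = [predict_proba(weights, xi) for xi in X]
pairs = match_controls(scores, [label == 1 for label in y], caliper=0.2)
print(pairs)  # list of (case_index, control_index) tuples
```

Matching without replacement keeps the matched cohort balanced at one control per case, at the cost of discarding cases that have no control within the caliper. A real analysis would also check covariate balance after matching.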