Knowledge graph-aided Bayesian active learning for top-K genetic interaction discovery.

Authors

Braden Soper, Michal Lisicki, Mary Silva, Jose Cadena, Haonan Zhu, Shivshankar Sundaram, Priyadip Ray, Jeff Drocco

Affiliations

Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA, 94550, USA.

School of Engineering, University of Guelph, Guelph, ON, N1G 2W1, Canada.

Publication information

Sci Rep. 2025 Aug 25;15(1):31196. doi: 10.1038/s41598-025-13972-7.

DOI: 10.1038/s41598-025-13972-7

PMID: 40854903

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12379194/

Abstract

In silico methods for predicting the effects of multi-gene perturbations hold great promise for advancing functional genomics, computational drug discovery, and disease modeling. However, the development of these predictive algorithms for mammalian systems has been hampered by limited datasets and high experimental costs. In this study, we present a Bayesian active learning framework designed to discover pairwise host gene knockdowns that effectively inhibit viral proliferation in an in vitro HIV-1 infection model. Our method leverages a biological knowledge graph as side information and employs a computationally efficient batch diversification approach. We evaluated this framework using a dataset of viral load measurements obtained from multi-day dual-gene depletion experiments, encompassing all possible pairwise knockdowns of over 350 host genes associated with HIV infection. We demonstrate that our framework rapidly identifies the most effective gene knockdown pairs for reducing viral load. Furthermore, we show that incorporating side information enhances performance during the early stages of active learning (low data regime), while our batch diversification strategy significantly boosts performance in later stages (high data regime). This framework is general and can be adapted to explore gene interactions in other contexts, such as synthetic lethality prediction and mapping epistatic effects across quantitative trait loci.
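The abstract does not specify the model, the acquisition rule, or the diversification scheme, so the following is only a minimal sketch of what a batch Bayesian active learning loop for top-K pair discovery could look like. Everything in it is an assumption made for illustration: Bayesian linear regression over pair features stands in for the authors' model, `gene_feats` is a hypothetical per-gene embedding standing in for knowledge-graph side information, Thompson sampling is used to rank candidate pairs, and a per-gene cap (`max_per_gene`) is used as a simple form of batch diversification. Measurements are read from a fully known matrix `y_true` without noise, which mimics a retrospective evaluation on an exhaustively measured dataset.

```python
import numpy as np


def run_active_learning(y_true, gene_feats, batch_size=8, rounds=10,
                        max_per_gene=2, noise_var=0.1, prior_var=1.0, seed=0):
    """Toy batch active learning over gene pairs; returns the batches queried.

    y_true[i, j] plays the role of the hidden viral load after knocking down
    genes i and j (lower is better). gene_feats is a hypothetical per-gene
    embedding standing in for knowledge-graph side information.
    """
    rng = np.random.default_rng(seed)
    n, d = gene_feats.shape
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Symmetric pair features: elementwise product of the two gene embeddings.
    X = np.array([gene_feats[i] * gene_feats[j] for i, j in pairs])
    observed = {}  # pair index -> measured value
    batches = []
    for _ in range(rounds):
        idx = np.array(list(observed), dtype=int)
        Xo = X[idx]
        yo = np.array([observed[p] for p in observed], dtype=float)
        # Conjugate Gaussian posterior over the regression weights.
        precision = np.eye(d) / prior_var + Xo.T @ Xo / noise_var
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2  # symmetrize against round-off
        mean = cov @ (Xo.T @ yo) / noise_var
        # Thompson sampling: a single posterior draw scores every pair.
        w = mean + np.linalg.cholesky(cov) @ rng.standard_normal(d)
        order = np.argsort(X @ w)  # lowest sampled viral load first
        # Greedy diversification: cap how often a gene appears in one batch.
        counts = np.zeros(n, dtype=int)
        batch = []
        for p in map(int, order):
            i, j = pairs[p]
            if p in observed or counts[i] >= max_per_gene or counts[j] >= max_per_gene:
                continue
            batch.append(p)
            counts[i] += 1
            counts[j] += 1
            if len(batch) == batch_size:
                break
        for p in batch:  # "run the experiment" for the selected pairs
            i, j = pairs[p]
            observed[p] = y_true[i, j]
        batches.append([pairs[p] for p in batch])
    return batches


def recall_at_k(y_true, batches, k=10):
    """Fraction of the k lowest-valued pairs that were queried."""
    n = y_true.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    best = sorted(pairs, key=lambda p: y_true[p])[:k]
    queried = {p for b in batches for p in b}
    return sum(p in queried for p in best) / k
```

Calling `run_active_learning(Y, F)` with a symmetric matrix `Y` and an embedding matrix `F` returns one list of gene pairs per round, and `recall_at_k(Y, batches, k)` reports how many of the K best pairs were found. Setting `max_per_gene` to `batch_size` disables the diversification cap, which gives a simple way to compare diversified and undiversified batches in this toy setting.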

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9a6/12379194/59ef2d1c682d/41598_2025_13972_Fig1_HTML.jpg
