Distributed Bayesian networks reconstruction on the whole genome scale.

Authors

Frolova Alina, Wilczyński Bartek

Affiliations

Institute of Molecular Biology and Genetics, Kyiv, Ukraine.

Institute of Informatics, University of Warsaw, Warsaw, Poland.

Publication information

PeerJ. 2018 Oct 19;6:e5692. doi: 10.7717/peerj.5692. eCollection 2018.

DOI: 10.7717/peerj.5692
PMID: 30364537
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6197044/
Abstract

BACKGROUND

Bayesian networks are directed acyclic graphical models widely used to represent probabilistic relationships between random variables. They have been applied in various biological contexts, including the inference of gene regulatory networks and protein-protein interactions. In general, learning Bayesian networks from experimental data is NP-hard, which has led to widespread use of heuristic search methods that give suboptimal results. However, when the acyclicity of the graph can be ensured externally, the optimal network can be found in polynomial time. Although our previously developed tool BNFinder implements a polynomial-time algorithm, reconstructing networks from large amounts of experimental data still leads to prohibitively long computations on a single CPU.
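
As a rough illustration of why external acyclicity helps: if edges may only run from a fixed set of regulators to separate targets, no cycle can form, so each target's parent set can be chosen independently of the others. The sketch below assumes discrete data, a BIC-style score, and a cap on parent-set size (`max_parents`), which keeps the number of candidate sets polynomial in the number of regulators. The score, the cap, and all function names are illustrative assumptions and are not taken from BNFinder.

```python
from collections import Counter
from itertools import combinations
from math import log


def log_likelihood(child, parents, data):
    """Maximum-likelihood log-likelihood of `child` given `parents` (discrete data)."""
    joint = Counter()          # counts of (parent configuration, child value)
    parent_counts = Counter()  # counts of parent configuration alone
    for row in data:
        key = tuple(row[p] for p in parents)
        joint[(key, row[child])] += 1
        parent_counts[key] += 1
    return sum(n * log(n / parent_counts[key]) for (key, _), n in joint.items())


def bic_score(child, parents, data, arity=2):
    """BIC-style score (higher is better): fit minus a complexity penalty."""
    n_params = (arity - 1) * arity ** len(parents)
    return log_likelihood(child, parents, data) - 0.5 * log(len(data)) * n_params


def best_parents(child, candidates, data, max_parents=2):
    """Score every candidate parent set of size <= max_parents and keep the best."""
    best = ((), bic_score(child, (), data))
    for size in range(1, max_parents + 1):
        for parents in combinations(candidates, size):
            score = bic_score(child, parents, data)
            if score > best[1]:
                best = (parents, score)
    return best


# Toy data (hypothetical): target T copies regulator R1; R2 is unrelated.
data = [{"R1": a, "R2": b, "T": a} for a in (0, 1) for b in (0, 1)] * 5

parents, score = best_parents("T", ["R1", "R2"], data)
print(parents)  # prints ('R1',)
```

Because regulators never appear as targets in this setup, the union of the per-target results is guaranteed to be a directed acyclic graph without any global check.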

RESULTS

In the present paper we propose a parallelized algorithm designed for multi-core and distributed systems, together with its implementation in an improved version of BNFinder, a tool for learning optimal Bayesian networks. The new algorithm has been tested on different simulated and experimental datasets, showing much better parallelization efficiency than the previous version. BNFinder gives accuracy comparable to current state-of-the-art inference methods and has a significant advantage when external information, such as a list of regulators or prior edge probabilities, can be introduced, particularly for datasets with static gene expression observations.
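
One way to picture the parallelization: when each target's parent-set search is independent, the searches can be handed to a pool of workers and the results merged. The sketch below is a minimal illustration under that assumption, not the algorithm described in the paper. The names `toy_score`, `search_one`, and `learn_network` are hypothetical, the score is a deliberately simple stand-in, and a thread pool is used only to keep the example portable; for CPU-bound work one would likely pass a process pool or a cluster-backed executor instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_score(target, parents, data):
    """Stand-in score: majority-vote accuracy minus a penalty per parent."""
    groups = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        groups.setdefault(key, Counter())[row[target]] += 1
    correct = sum(c.most_common(1)[0][1] for c in groups.values())
    return correct / len(data) - 0.1 * len(parents)


def search_one(target, regulators, data, max_parents):
    """Independent unit of work: best parent set for a single target."""
    candidate_sets = [
        ps for k in range(max_parents + 1) for ps in combinations(regulators, k)
    ]
    return target, max(candidate_sets, key=lambda ps: toy_score(target, ps, data))


def learn_network(targets, regulators, data, executor, max_parents=2):
    """Submit one task per target to `executor` and merge the results."""
    futures = [
        executor.submit(search_one, t, regulators, data, max_parents)
        for t in targets
    ]
    return dict(f.result() for f in futures)


# Toy data (hypothetical): T1 copies R1, T2 copies R2, T3 is R1 XOR R2.
data = [
    {"R1": a, "R2": b, "T1": a, "T2": b, "T3": a ^ b}
    for a in (0, 1) for b in (0, 1)
] * 3

with ThreadPoolExecutor(max_workers=2) as pool:
    network = learn_network(["T1", "T2", "T3"], ["R1", "R2"], data, pool)

print(network)
```

Passing the executor in as an argument keeps the search code unchanged whether the workers are threads, local processes, or remote nodes. Per-target tasks can be very uneven in cost, so a finer-grained split of the work may balance load better than this one-task-per-target scheme.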

CONCLUSIONS

We show that the new method can be used to reconstruct networks in the size range of thousands of genes, making it practically applicable to whole-genome datasets of prokaryotic systems and to large components of eukaryotic genomes. Our benchmarking results on realistic datasets indicate that the tool should be useful to a wide audience of researchers interested in discovering dependencies in their large-scale transcriptomic datasets.
