A Novel Bayesian Network Structural Learning Algorithm and Its Comprehensive Performance Evaluation Against Open-Source Software

Affiliation

BERG Health, Framingham, Massachusetts, USA.

Publication information

J Comput Biol. 2020 May;27(5):698-708. doi: 10.1089/cmb.2019.0210. Epub 2019 Sep 5.

DOI:10.1089/cmb.2019.0210

PMID:31486672

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7232674/

Abstract

Structural learning of Bayesian networks (BNs) from observational data has seen growing use and attention across scientific and industrial fields. The mathematical theory of BNs and their optimization is well developed. Although several open-source BN learners are publicly available, none of them can handle both small and large feature space data and recover network structures with acceptable accuracy. The tool presented here is novel BN learning and simulation software from BERG. It was developed with the goal of learning BNs from "Big Data" in health care, which often exceeds hundreds of thousands of features when research is conducted in genomics or multi-omics. This article provides a comprehensive performance evaluation of the tool and a comparison with the open-source BN learners. The study investigated synthetic datasets of discrete, continuous, and mixed data in both small and large feature spaces. The results demonstrated that the tool outperformed the publicly available algorithms in structure recovery precision in almost all of the evaluated settings, achieving a true positive rate of 0.9 and a precision of 0.8. In addition, it supports all data types, including continuous, discrete, and mixed variables. It is effectively parallelized on a distributed system and can handle datasets with thousands of features, which none of the publicly available tools can process at a desired level of recovery accuracy.
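For readers unfamiliar with the two structure recovery metrics quoted in the abstract, the following is a minimal sketch of how a true positive rate and a precision could be computed from a known network and a learned one. It assumes that an edge counts as recovered only when it matches the true edge exactly, including direction; the abstract does not state the paper's scoring convention, and the function name and toy edge lists are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (parent, child); directed edge, an assumption of this sketch


def structure_recovery_metrics(
    true_edges: Iterable[Edge], learned_edges: Iterable[Edge]
) -> Tuple[float, float]:
    """Return (true_positive_rate, precision) for exact directed-edge matches."""
    truth: Set[Edge] = set(true_edges)
    learned: Set[Edge] = set(learned_edges)
    true_positives = len(truth & learned)
    # TPR: fraction of true edges that the learner recovered.
    tpr = true_positives / len(truth) if truth else 0.0
    # Precision: fraction of learned edges that are actually true.
    precision = true_positives / len(learned) if learned else 0.0
    return tpr, precision


# Hypothetical toy networks, not data from the study.
true_edges = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E")]
learned_edges = [("A", "B"), ("B", "C"), ("D", "E"), ("C", "E")]

tpr, precision = structure_recovery_metrics(true_edges, learned_edges)
print(f"TPR={tpr:.2f}, precision={precision:.2f}")
```

Under this convention a reversed edge counts as both a missed true edge and a false discovery; an evaluation that scores undirected skeletons or equivalence classes would give different numbers.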

