Barth Vitor, Serrão Fábio, Maciel Carlos
Department of Electrical and Computing Engineering, University of São Paulo, São Carlos 13566-590, SP, Brazil.
Department of Physical Therapy, Federal University of São Carlos, São Carlos 13565-905, SP, Brazil.
Entropy (Basel). 2024 Sep 30;26(10):829. doi: 10.3390/e26100829.
Learning Bayesian networks from data aims to create a Directed Acyclic Graph (DAG) that encodes significant statistical relationships between variables and their joint probability distributions. However, when using real-world data with limited knowledge of the original dynamical system, it is challenging to determine whether the learned DAG accurately reflects the underlying relationships, especially when the data come from multiple independent sources. This paper describes a methodology capable of assessing the credible interval for the existence and direction of each edge within Bayesian networks learned from data, without previous knowledge of the underlying dynamical system. It offers several advantages over classical methods, such as data fusion from multiple sources, identification of latent variables, and extraction of the most prominent edges with their respective credible intervals. The method is evaluated using simulated datasets of various sizes and a real use case. Our approach achieved results comparable to the most recent studies in the field, while providing more information on the model's credibility.