Baldi Pierre, Rosen-Zvi Michal
School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA.
Neural Netw. 2005 Oct;18(8):1080-6. doi: 10.1016/j.neunet.2005.07.007. Epub 2005 Sep 12.
Machine learning methods that can handle variable-size structured data, such as sequences and graphs, include Bayesian networks (BNs) and Recursive Neural Networks (RNNs). In both classes of models, the data is modeled using a set of observed and hidden variables associated with the nodes of a directed acyclic graph. In BNs, the conditional relationships between parent and child variables are probabilistic, whereas in RNNs they are deterministic and parameterized by neural networks. Here, we study the formal relationship between the two classes of models and show that, when the source node variables are observed, RNNs can be viewed as limits, both in distribution and in probability, of BNs whose local conditional distributions have vanishing covariance matrices and converge to delta functions. Conditions for uniform convergence are also given, together with an analysis of the behavior and exactness of Belief Propagation (BP) in 'deterministic' BNs. Implications for the design of mixed architectures and the corresponding inference algorithms are briefly discussed.
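The limiting construction described in the abstract can be written compactly. As a minimal sketch, assuming isotropic Gaussian local conditional distributions (one natural choice of vanishing-covariance family; the paper's general statement need not be restricted to Gaussians, and the symbols x_v, pa(v), f_w, and sigma below are illustrative notation), a BN node whose parent-to-child map f_w is computed by a neural network satisfies

% Local conditional distribution of a child variable x_v given its
% parents x_{pa(v)}: a Gaussian centered on the neural-network map f_w,
% with covariance sigma^2 I shrinking to zero.
p_\sigma\bigl(x_v \mid x_{\mathrm{pa}(v)}\bigr)
  = \mathcal{N}\bigl(x_v;\, f_w(x_{\mathrm{pa}(v)}),\, \sigma^2 I\bigr)
  \;\xrightarrow{\;\sigma \to 0\;}\;
  \delta\bigl(x_v - f_w(x_{\mathrm{pa}(v)})\bigr),

so that in the sigma -> 0 limit each local conditional distribution collapses onto the deterministic relation x_v = f_w(x_{pa(v)}), recovering the deterministic forward propagation of an RNN unit over the same directed acyclic graph.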