The estimation of distributions and the minimum relative entropy principle.

Author information

Mühlenbein Heinz, Höns Robin

Affiliations

Fraunhofer Institute for Autonomous Intelligent Systems, 53754 Sankt Augustin, Germany.

Publication information

Evol Comput. 2005 Spring;13(1):1-27. doi: 10.1162/1063656053583469.

Abstract

Estimation of Distribution Algorithms (EDA) have been proposed as an extension of genetic algorithms. In this paper we explain the relationship of EDA to algorithms developed in statistics, artificial intelligence, and statistical physics. The major design issues are discussed within a general interdisciplinary framework. It is shown that maximum entropy approximations play a crucial role. All proposed algorithms try to minimize the Kullback-Leibler divergence (KLD) between the unknown distribution p(x) and a class q(x) of approximations. However, the Kullback-Leibler divergence is not symmetric. Approximations which suppose that the function to be optimized is additively decomposed (ADF) minimize KLD(q||p), whereas the methods which learn the approximate model from data minimize KLD(p||q). This minimization is identical to maximizing the log-likelihood. In the paper, three classes of algorithms are discussed. FDA uses the ADF to compute an approximate factorization of the unknown distribution. The factors are marginal distributions, whose values are computed from samples. The second class is represented by the Bethe-Kikuchi approach, which has recently been rediscovered in statistical physics. Here the values of the marginals are computed from a difficult constrained minimization problem. The third class learns the factorization from the data. We analyze our learning algorithm LFDA in detail. It is shown that learning is faced with two problems: first, to detect the important dependencies between the variables, and second, to create an acyclic Bayesian network of bounded clique size.
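
The asymmetry the abstract refers to can be made concrete. The following is a hedged illustration in standard textbook notation, not quoted from the paper itself; the symbols p and q follow the abstract's usage:

```latex
% The two directions of the Kullback-Leibler divergence for a discrete x.
\[
  \mathrm{KLD}(p\,\|\,q) = \sum_{x} p(x)\,\log\frac{p(x)}{q(x)},
  \qquad
  \mathrm{KLD}(q\,\|\,p) = \sum_{x} q(x)\,\log\frac{q(x)}{p(x)}.
\]
% Expanding KLD(p||q) = -H(p) - \sum_x p(x) \log q(x) shows that, since the
% entropy H(p) does not depend on q, minimizing KLD(p||q) over q equals
% maximizing the expected log-likelihood -- the equivalence the abstract states:
\[
  \arg\min_{q}\,\mathrm{KLD}(p\,\|\,q)
  = \arg\max_{q}\,\sum_{x} p(x)\,\log q(x).
\]
```

When p is the empirical distribution of a data sample, the right-hand side is, up to a constant factor, the log-likelihood of the data under q.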
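The paper's FDA and LFDA build multivariate factorizations; as a much simpler sketch of the underlying EDA loop (a univariate model in the spirit of UMDA, with all names and parameter values chosen here for illustration, not taken from the paper), estimating marginal values from selected samples looks like this:

```python
import numpy as np

def umda(fitness, n_bits, pop_size=100, n_select=50, generations=50, rng=None):
    """Minimal univariate EDA sketch (UMDA-style), for illustration only.

    Each generation: sample a population from independent Bernoulli
    marginals, select the fittest individuals, and re-estimate the
    marginals from the selected sample -- the simplest instance of the
    'values computed from samples' idea in the abstract.
    """
    rng = rng or np.random.default_rng()
    p = np.full(n_bits, 0.5)                         # initial marginals
    for _ in range(generations):
        pop = (rng.random((pop_size, n_bits)) < p).astype(int)
        scores = np.array([fitness(ind) for ind in pop])
        best = pop[np.argsort(scores)[-n_select:]]   # truncation selection
        p = best.mean(axis=0)                        # new marginal estimates
        p = np.clip(p, 0.05, 0.95)                   # keep sampling diversity
    return p

# Usage: maximize OneMax (the number of ones in the bit string).
if __name__ == "__main__":
    marginals = umda(fitness=lambda x: x.sum(), n_bits=20)
    print(marginals.round(2))
```

This is only the univariate special case; FDA replaces the product of independent marginals with a factorization derived from the ADF structure, and LFDA learns that factorization from data.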

Similar articles

Space complexity of estimation of distribution algorithms.
Evol Comput. 2005 Spring;13(1):125-43. doi: 10.1162/1063656053583423.
