Mabu Shingo, Hirasawa Kotaro, Hu Jinglu
Graduate School of Information, Production and Systems, Waseda University, Hibikino 2-7 Wakamatsu-ku, Kitakyushu, Fukuoka, 808-0135, Japan.
Evol Comput. 2007 Fall;15(3):369-98. doi: 10.1162/evco.2007.15.3.369.
This paper proposes a graph-based evolutionary algorithm called Genetic Network Programming (GNP). Our goal is to develop GNP so that it can deal with dynamic environments efficiently and effectively by exploiting the high expressive power of graph (network) structures. GNP has the following characteristics. 1) A GNP program is composed of a number of nodes, each executing a simple judgment or processing function, and these nodes are connected to each other by directed links. 2) The graph structure allows GNP to reuse nodes, so the structure can be very compact. 3) Node transitions follow the node connections and there are no terminal nodes, so the past history of node transitions affects which node is used next; this works as an implicit memory function. These structural characteristics are useful for dealing with dynamic environments. Furthermore, we propose an extended algorithm, "GNP with Reinforcement Learning (GNPRL)", which combines evolution and reinforcement learning in order to create effective graph structures and obtain better results in dynamic environments. To evaluate its effectiveness, we applied GNP to the problem of determining agents' behavior, using Tileworld as the simulation environment. The results show some advantages of GNP over conventional methods.
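The three structural characteristics listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `Node` and `GNPProgram`, the `max_judgments` limit, the toy observation key `tile_ahead`, and the action labels are all assumptions made for illustration, and the evolutionary operators and GNPRL are left out entirely. The sketch only shows node transition over a directed graph without terminal nodes, where the current node persists between decisions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Env = Dict[str, bool]  # hypothetical toy observation, e.g. {"tile_ahead": True}


@dataclass
class Node:
    kind: str                # "judgment" or "processing"
    func: Callable[[Env], object]  # judgment: returns a branch index; processing: returns an action label
    next_nodes: List[int]    # directed links to other nodes (indices into the node list)


class GNPProgram:
    """Directed graph of nodes; there is no terminal node, so execution never resets."""

    def __init__(self, nodes: List[Node], start: int = 0) -> None:
        self.nodes = nodes
        self.current = start  # persists between calls to step()

    def step(self, env: Env, max_judgments: int = 10) -> Optional[str]:
        """Follow links from the current node until a processing node fires."""
        for _ in range(max_judgments + 1):
            node = self.nodes[self.current]
            if node.kind == "judgment":
                branch = node.func(env)              # pick an outgoing link
                self.current = node.next_nodes[branch]
            else:
                action = node.func(env)              # produce an action
                self.current = node.next_nodes[0]    # processing nodes have one link
                return action
        return None  # assumed safeguard: no processing node reached within the limit


def build_demo_nodes() -> List[Node]:
    """Hypothetical four-node program; node 0 is reused by several paths."""
    return [
        Node("judgment", lambda env: 0 if env["tile_ahead"] else 1, [1, 2]),
        Node("processing", lambda env: "move_forward", [0]),
        Node("processing", lambda env: "turn_left", [3]),
        Node("processing", lambda env: "move_forward", [0]),
    ]


def demo() -> List[Optional[str]]:
    prog = GNPProgram(build_demo_nodes())
    obs = {"tile_ahead": False}
    # The same observation is presented twice; the actions differ because
    # the second call resumes from the node reached by the first call.
    return [prog.step(obs), prog.step(obs)]
```

Calling `demo()` returns `["turn_left", "move_forward"]`: the observation is identical in both calls, but the second decision starts from the node reached by the first, which is the implicit memory effect described in characteristic 3.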