Oommen B John, Kim Sang-Woon, Samuel Mathew T, Granmo Ole-Christoffer
School of Computer Science, Carleton University, Ottawa, Ontario, Canada.
IEEE Trans Syst Man Cybern B Cybern. 2008 Apr;38(2):466-76. doi: 10.1109/TSMCB.2007.913602.
This paper reports the first known solution to the stochastic point location (SPL) problem when the environment is nonstationary. The SPL problem involves a general learning problem in which the learning mechanism (which could be a robot, a learning automaton, or, in general, an algorithm) attempts to learn a "parameter," for example, lambda*, within a closed interval. However, unlike the earlier reported results, we consider the scenario when the learning is to be done in a nonstationary setting. For each guess, the environment essentially informs the mechanism, possibly erroneously (i.e., with probability p), which way it should move to reach the unknown point. Unlike the results available in the literature, we consider the fascinating case when the point sought for is itself stochastically moving (which is modeled as follows). The environment communicates with an intermediate entity (referred to as the teacher/oracle) about the point itself, i.e., advising where it should go. The mechanism that searches for the point in turn receives responses from the teacher/oracle, which directs how it should move. Therefore, the point itself, in the overall setting, is moving, i.e., delivering possibly incorrect information about its location to the teacher/oracle. This in turn means that the "environment" is itself nonstationary, which implies that the advice of the teacher/oracle is both uncertain and changing with time-rendering the problem extremely fascinating. The heart of the strategy we propose involves discretizing the space and performing a controlled random walk on this space. Apart from deriving some analytic results about our solution, we also report the simulation results that demonstrate the power of the scheme, and state some potential applications.
本文报道了在环境非平稳时随机点定位(SPL)问题的首个已知解决方案。SPL问题涉及一个一般的学习问题,其中学习机制(可以是机器人、学习自动机,或者一般来说是一种算法)试图在一个闭区间内学习一个“参数”,例如λ*。然而,与早期报道的结果不同,我们考虑的是在非平稳环境中进行学习的情况。对于每次猜测,环境本质上会告知该机制,可能会有误(即概率为p),它应该朝哪个方向移动才能到达未知点。与文献中已有的结果不同,我们考虑了一种引人入胜的情况,即所寻找的点本身在随机移动(其建模方式如下)。环境与一个中间实体(称为教师/预言机)就该点本身进行通信,即告知它应该往哪里去。反过来,搜索该点的机制会从教师/预言机接收响应,教师/预言机指示它应该如何移动。因此,在整个场景中,该点本身在移动,即向教师/预言机传递关于其位置的可能不正确的信息。这反过来意味着“环境”本身是非平稳的,这意味着教师/预言机的建议既不确定又随时间变化——这使得该问题极具吸引力。我们提出的策略的核心涉及对空间进行离散化,并在这个空间上进行受控随机游走。除了推导关于我们解决方案的一些分析结果外,我们还报告了模拟结果,这些结果证明了该方案的有效性,并阐述了一些潜在应用。