Taioli Francesco, Giuliari Francesco, Wang Yiming, Berra Riccardo, Castellini Alberto, Del Bue Alessio, Farinelli Alessandro, Cristani Marco, Setti Francesco
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):11047-11058. doi: 10.1109/TPAMI.2024.3451994. Epub 2024 Nov 6.
We propose a solution for Active Visual Search of objects in an environment whose 2D floor map is the only known information. Our solution has three key features that make it more plausible and more robust to detector failures than state-of-the-art methods: i) it is unsupervised, as it does not need any training sessions; ii) during exploration, a probability distribution on the 2D floor map is updated according to an intuitive mechanism, while an improved belief update increases the effectiveness of the agent's exploration; iii) it incorporates the awareness that an object detector may fail into the aforementioned probability modelling by exploiting the success statistics of a specific detector. Our solution is dubbed POMP-BE-PD (POMCP-based Online Motion Planning with Belief by Exploration and Probabilistic Detection). It uses the current pose of an agent and an RGB-D observation to learn an optimal search policy, exploiting a POMDP solved by a Monte-Carlo planning approach. On the Active Vision Dataset Benchmark, we increase the average success rate over all the environments by a significant 35% while decreasing the average path length by 4% with respect to competing methods. Thus, our results are state-of-the-art, even without any training procedure.
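The idea of folding detector success statistics into a probability distribution over the floor map can be illustrated with a plain Bayes update. The following is a minimal sketch and not the paper's actual update rule: the function name `update_belief`, the cell representation, and the rates `tpr` (chance the detector fires when the object is in view) and `fpr` (chance it fires when the object is not in view) are all hypothetical choices made for illustration.

```python
def update_belief(belief, visible, detected, tpr=0.8, fpr=0.1):
    """Bayes update of a belief over floor-map cells after one detector reading.

    belief   -- dict mapping cell -> probability that the object is in that cell
    visible  -- set of cells inside the agent's current field of view
    detected -- True if the detector fired on this observation
    tpr, fpr -- assumed detector statistics (illustrative values, not from the paper)
    """
    posterior = {}
    for cell, prior in belief.items():
        # Probability that the detector fires if the object were in this cell.
        p_fire = tpr if cell in visible else fpr
        likelihood = p_fire if detected else 1.0 - p_fire
        posterior[cell] = likelihood * prior
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {cell: p / total for cell, p in posterior.items()}


# Uniform prior over a hypothetical 2x2 floor map.
cells = [(r, c) for r in range(2) for c in range(2)]
belief = {cell: 1.0 / len(cells) for cell in cells}

# The agent sees the top row and the detector does not fire.
visible = {(0, 0), (0, 1)}
belief = update_belief(belief, visible, detected=False)
```

Because a missed detection is treated as possible rather than conclusive, the observed cells keep some probability mass instead of being ruled out, which is the kind of robustness to detector failure the abstract describes.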