Department of Immunology, Kyungpook National University School of Medicine, Daegu 41944, Republic of Korea.
Department of Physiology, Pusan National University School of Medicine, Yangsan 50612, Republic of Korea.
Sensors (Basel). 2024 Oct 3;24(19):6419. doi: 10.3390/s24196419.
In this study, we investigate the adaptability of artificial agents in a noisy T-maze, modeled as a Markov decision process (MDP), that learn with successor feature (SF) and predecessor feature (PF) algorithms. Our focus is on quantifying how two hyperparameters, the reward learning rate (αr) and the eligibility trace decay rate (λ), affect the agents' adaptability. Adaptation is evaluated with four metrics (cumulative reward, step length, adaptation rate, and adaptation step length), and the relationships between the hyperparameters and these metrics are analyzed using Spearman's correlation tests and linear regression. Our findings reveal that an αr of 0.9 consistently yields superior adaptation across all metrics at a noise level of 0.05. However, the optimal setting for λ varies by metric and context. In discussing these results, we emphasize the critical role of hyperparameter optimization in refining the performance and transfer learning efficacy of learning algorithms. This research advances our understanding of how PF and SF algorithms function, particularly when handling the inherent uncertainty of transfer learning tasks. By offering insights into optimal hyperparameter configurations, this study contributes to the development of more adaptive and robust learning algorithms and supports future work in artificial intelligence and neuroscience.
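The kind of analysis named above, a Spearman correlation and a linear regression relating a hyperparameter to an adaptation metric, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' analysis code. The function names and the data values are hypothetical placeholders chosen for the example and are not results from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    return slope, my - slope * mx


# Hypothetical placeholder data: one metric value per reward learning rate.
alpha_r = [0.1, 0.3, 0.5, 0.7, 0.9]
cumulative_reward = [10.0, 14.0, 13.0, 20.0, 26.0]

rho = spearman(alpha_r, cumulative_reward)
slope, intercept = linear_fit(alpha_r, cumulative_reward)
print(f"Spearman rho = {rho:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Spearman's test is rank-based, so it detects any monotonic relationship between a hyperparameter and a metric, while the regression slope estimates how much the metric changes per unit change in the hyperparameter under a linear assumption. A significance test (p-value) is omitted here for brevity.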