IEEE Trans Cybern. 2020 Oct;50(10):4214-4227. doi: 10.1109/TCYB.2019.2906574. Epub 2019 Apr 11.
Agent advising is a key approach to improving agent learning performance: it lets agents ask one another for advice. Existing agent advising approaches have two limitations. First, all agents in a system are assumed to be friendly and cooperative. In the real world, however, malicious agents may exist and provide false advice to hinder the learning performance of other agents. Second, the analysis of communication overhead in these approaches is either overlooked or simplified, even though communication overhead has to be carefully considered in communication-constrained environments. To overcome these two limitations, this paper proposes a novel differentially private agent advising approach. Our approach employs the Laplace mechanism to add noise to the rewards that student agents use to select teacher agents. By using the differential privacy technique, the proposed approach can reduce the impact of malicious agents without identifying them. In addition, by adopting the privacy budget concept, the proposed approach can naturally control communication overhead. The experimental results demonstrate the effectiveness of the proposed approach.
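The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that a student keeps one reward estimate per candidate teacher, perturbs each estimate with Laplace noise of scale `sensitivity / epsilon`, and asks the teacher with the highest noisy reward. It also assumes that every advice request spends a fixed `epsilon_per_query` from a total privacy budget, so the budget caps the number of requests. The names `Student`, `select_teacher`, `laplace_noise`, and all parameters are hypothetical.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) by inverse-CDF sampling."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # max(...) guards against log(0) when u is exactly -0.5.
    tail = max(1.0 - 2.0 * abs(u), 1e-300)
    return -scale * math.copysign(1.0, u) * math.log(tail)


def select_teacher(rewards, epsilon, sensitivity, rng):
    """Return the index of the teacher with the highest noisy reward.

    Assumption: noise scale = sensitivity / epsilon, the standard
    calibration of the Laplace mechanism.
    """
    scale = sensitivity / epsilon
    noisy = [r + laplace_noise(scale, rng) for r in rewards]
    return max(range(len(noisy)), key=noisy.__getitem__)


class Student:
    """Hypothetical student agent with a finite privacy budget."""

    def __init__(self, total_budget, epsilon_per_query, seed=0):
        self.remaining = total_budget
        self.epsilon = epsilon_per_query
        self.rng = random.Random(seed)

    def ask(self, rewards, sensitivity=1.0):
        """Pick a teacher, or return None once the budget is spent."""
        if self.remaining < self.epsilon:
            return None  # no budget left: no further communication
        self.remaining -= self.epsilon
        return select_teacher(rewards, self.epsilon, sensitivity, self.rng)


if __name__ == "__main__":
    student = Student(total_budget=1.0, epsilon_per_query=0.25)
    rewards = [0.1, 0.9, 0.3]  # made-up reward estimates for three teachers
    for step in range(5):
        print(step, student.ask(rewards))
```

In this sketch the budget of 1.0 with 0.25 spent per request allows four requests, and the fifth call returns `None`. A smaller `epsilon_per_query` allows more requests but adds more noise to each selection, so the student is less likely to keep returning to any single teacher, including one whose reward a malicious agent has inflated.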