


Push-Sum Distributed Online Optimization With Bandit Feedback.

Author Information

Wang Cong, Xu Shengyuan, Yuan Deming, Zhang Baoyong, Zhang Zhengqiang

Publication Information

IEEE Trans Cybern. 2022 Apr;52(4):2263-2273. doi: 10.1109/TCYB.2020.2999309. Epub 2022 Apr 5.

DOI: 10.1109/TCYB.2020.2999309
PMID: 32609617
Abstract

In this article, we concentrate on distributed online convex optimization problems over multiagent systems, where the communication between nodes is represented by a class of directed graphs that are time varying and uniformly strongly connected. The problem is studied under bandit feedback, in the sense that at each time only the cost function value at the committed point is revealed to each node. Nodes then update their decisions by exchanging information with their neighbors only. To handle Lipschitz continuous and strongly convex cost functions, we develop a distributed online convex optimization algorithm that achieves sublinear individual regret for every node. The algorithm is built on the push-sum scheme, which removes the requirement of doubly stochastic weight matrices, and on a one-point gradient estimator, which requires the function value at only one point per iteration instead of the gradient of the loss function. The expected regret of the proposed algorithm scales as O(T ln(T)), where T is the number of iterations. To validate the performance of the developed algorithm, we provide a simulation of a numerical example.
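For context, the classical one-point construction that matches the abstract's description (the paper may use a variant of it) evaluates the loss at a single randomly perturbed point and is unbiased for the gradient of a smoothed surrogate:

```latex
% Single-evaluation gradient estimate with exploration radius \delta > 0 and
% u drawn uniformly from the unit sphere S^{d-1} (Flaxman-Kalai-McMahan, 2005):
%   g_t = (d/\delta) \, f_t(x + \delta u) \, u .
% It is an unbiased gradient of the ball-smoothed surrogate \hat f_t:
\mathbb{E}_{u}\!\left[\frac{d}{\delta}\, f_t(x + \delta u)\, u\right]
  = \nabla \hat f_t(x),
\qquad
\hat f_t(x) \;=\; \mathbb{E}_{v \sim \mathbb{B}^d}\!\left[f_t(x + \delta v)\right].
```

Running a gradient method on the surrogates \hat f_t therefore needs only one revealed cost value per round, at the price of an O(\delta) approximation bias and higher estimator variance, which is what drives the slower regret rates typical of bandit feedback.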

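To make the mechanics concrete, below is a minimal Python sketch, not the paper's exact algorithm: it combines a push-sum consensus step over a time-varying column-stochastic digraph (so no doubly stochastic weights are needed) with the one-point estimator above. The quadratic losses, step-size and exploration schedules, projection radius, and graph sequence are all illustrative assumptions.

```python
# Minimal sketch of push-sum distributed online optimization with one-point
# bandit feedback. Everything below is an illustrative assumption, not the
# paper's exact construction.
import numpy as np

rng = np.random.default_rng(0)
n, d, T, R = 5, 3, 3000, 2.0      # nodes, dimension, horizon, feasible radius

# Hypothetical local losses f_i(x) = ||x - c_i||^2: Lipschitz on the feasible
# ball and strongly convex; the network-wide optimum is the mean of the c_i.
centers = rng.uniform(-1.0, 1.0, size=(n, d))

def loss(i, x):
    # Bandit feedback: only this scalar value is ever revealed to node i.
    return float(np.sum((x - centers[i]) ** 2))

def project(x, r=R):
    nrm = np.linalg.norm(x)
    return x if nrm <= r else x * (r / nrm)

def mixing_matrix(t):
    # Time-varying directed ring: node j sends to itself and node (j + t) % n.
    # Columns sum to 1 (column-stochastic); push-sum needs nothing stronger,
    # i.e., no doubly stochastic weights.
    A = np.eye(n)
    for j in range(n):
        A[(j + t) % n, j] += 1.0
    return A / A.sum(axis=0, keepdims=True)

x = np.zeros((n, d))              # push-sum numerators
y = np.ones(n)                    # push-sum weights

for t in range(1, T + 1):
    delta = t ** -0.25            # exploration radius (assumed schedule)
    alpha = 1.0 / t               # step size (assumed schedule)

    A = mixing_matrix(t)
    w = A @ x                     # mix numerators over the digraph
    y = A @ y                     # mix weights
    z = w / y[:, None]            # de-biased local estimates of the average

    for i in range(n):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)    # uniform direction on the unit sphere
        # One-point estimator: (d/delta) * f(z + delta*u) * u is an unbiased
        # gradient of a delta-smoothed version of the local loss.
        g = (d / delta) * loss(i, z[i] + delta * u) * u
        # Projected step in the de-biased coordinates, rescaled by the weight.
        x[i] = y[i] * project(z[i] - alpha * g)

print("consensus estimate:", np.round(z.mean(axis=0), 3))
print("network optimum:   ", np.round(centers.mean(axis=0), 3))
```

On this toy problem the de-biased iterates z_i of all nodes drift toward the minimizer of the aggregate loss (the mean of the c_i), illustrating consensus and optimization proceeding simultaneously from scalar-only feedback.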

Similar Articles

1. Push-Sum Distributed Online Optimization With Bandit Feedback.
   IEEE Trans Cybern. 2022 Apr;52(4):2263-2273. doi: 10.1109/TCYB.2020.2999309. Epub 2022 Apr 5.
2. Distributed Online Constrained Optimization With Feedback Delays.
   IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):1708-1720. doi: 10.1109/TNNLS.2022.3184957. Epub 2024 Feb 5.
3. Distributed Online Stochastic-Constrained Convex Optimization With Bandit Feedback.
   IEEE Trans Cybern. 2024 Jan;54(1):63-75. doi: 10.1109/TCYB.2022.3177644. Epub 2023 Dec 20.
4. Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback.
   IEEE Trans Cybern. 2022 Feb;52(2):1009-1020. doi: 10.1109/TCYB.2020.2990796. Epub 2022 Feb 16.
5. Privacy Masking Stochastic Subgradient-Push Algorithm for Distributed Online Optimization.
   IEEE Trans Cybern. 2021 Jun;51(6):3224-3237. doi: 10.1109/TCYB.2020.2973221. Epub 2021 May 18.
6. Stochastic Strongly Convex Optimization via Distributed Epoch Stochastic Gradient Algorithm.
   IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2344-2357. doi: 10.1109/TNNLS.2020.3004723. Epub 2021 Jun 2.
7. An Adaptive Primal-Dual Subgradient Algorithm for Online Distributed Constrained Optimization.
   IEEE Trans Cybern. 2018 Nov;48(11):3045-3055. doi: 10.1109/TCYB.2017.2755720. Epub 2017 Oct 5.
8. Distributed Randomized Gradient-Free Optimization Protocol of Multiagent Systems Over Weight-Unbalanced Digraphs.
   IEEE Trans Cybern. 2021 Jan;51(1):473-482. doi: 10.1109/TCYB.2018.2890140. Epub 2020 Dec 22.
9. Distributed Actor-Critic Algorithms for Multiagent Reinforcement Learning Over Directed Graphs.
   IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7210-7221. doi: 10.1109/TNNLS.2021.3139138. Epub 2023 Oct 5.
10. Distributed Stochastic Gradient Tracking Algorithm With Variance Reduction for Non-Convex Optimization.
    IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5310-5321. doi: 10.1109/TNNLS.2022.3170944. Epub 2023 Sep 1.