Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback.

Authors

Li Jueyou, Gu Chuanye, Wu Zhiyou, Huang Tingwen

Publication information

IEEE Trans Cybern. 2022 Feb;52(2):1009-1020. doi: 10.1109/TCYB.2020.2990796. Epub 2022 Feb 16.

DOI: 10.1109/TCYB.2020.2990796
PMID: 32452789
Abstract

This article focuses on multiagent distributed constrained optimization problems in a dynamic environment, in which a group of agents aims to cooperatively optimize a sum of time-varying local cost functions subject to time-varying coupled constraints. Neither the local cost functions nor the constraint functions are revealed to an individual agent until it submits an action. We first investigate a gradient-feedback scenario, in which each agent can access both the values and the gradients of its own cost and constraint functions at the chosen action. For this scenario, we design a distributed primal-dual online learning algorithm and show that it achieves sublinear bounds on both the regret and the constraint violations. Furthermore, we extend the gradient-feedback algorithm to a gradient-free setup, in which an individual agent can access only the values of its local cost and constraint functions at two queried points near the selected action. We develop a bandit version of the previous method and give explicit sublinear bounds on the expected regret and the expected constraint violations. The results indicate that the bandit algorithm can achieve almost the same performance as the gradient-feedback algorithm under mild conditions. Finally, numerical simulations on an electric vehicle charging problem demonstrate the effectiveness of the proposed algorithms.
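
The gradient-free setup described in the abstract replaces the true gradient with an estimate built from two function values queried near the chosen action. The sketch below shows one common two-point estimator of this kind. It is an illustration under stated assumptions, not the construction used in the article: the function name `two_point_gradient_estimate`, the smoothing radius `delta`, and the choice of a uniformly random unit direction are all hypothetical choices made here for the example.

```python
import numpy as np


def two_point_gradient_estimate(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations.

    Hypothetical helper: the scaling and the sampling distribution are
    assumptions for this sketch, not taken from the article.
    """
    d = x.size
    # Draw a direction uniformly from the unit sphere.
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Query the function at two points near x, symmetric along u.
    diff = f(x + delta * u) - f(x - delta * u)
    # The factor d / (2 * delta) makes the average over u match the
    # gradient of a smoothed version of f.
    return (d / (2.0 * delta)) * diff * u


def averaged_estimate(f, x, delta, num_samples, seed=0):
    """Average many independent two-point estimates (for illustration only)."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=float)
    for _ in range(num_samples):
        total += two_point_gradient_estimate(f, x, delta, rng)
    return total / num_samples
```

A single estimate is noisy, since it only carries information along one random direction. In an online algorithm it would stand in for the gradient at each round, and the averaging helper is included only to show that the estimates concentrate around the true gradient for a smooth function.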

Similar articles

1
Online Learning Algorithm for Distributed Convex Optimization With Time-Varying Coupled Constraints and Bandit Feedback.
IEEE Trans Cybern. 2022 Feb;52(2):1009-1020. doi: 10.1109/TCYB.2020.2990796. Epub 2022 Feb 16.
2
Distributed Online Stochastic-Constrained Convex Optimization With Bandit Feedback.
IEEE Trans Cybern. 2024 Jan;54(1):63-75. doi: 10.1109/TCYB.2022.3177644. Epub 2023 Dec 20.
3
Distributed Online Constrained Optimization With Feedback Delays.
IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):1708-1720. doi: 10.1109/TNNLS.2022.3184957. Epub 2024 Feb 5.
4
Push-Sum Distributed Online Optimization With Bandit Feedback.
IEEE Trans Cybern. 2022 Apr;52(4):2263-2273. doi: 10.1109/TCYB.2020.2999309. Epub 2022 Apr 5.
5
Distributed Online Learning Algorithm for Noncooperative Games Over Unbalanced Digraphs.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15846-15856. doi: 10.1109/TNNLS.2023.3290049. Epub 2024 Oct 29.
6
An Adaptive Primal-Dual Subgradient Algorithm for Online Distributed Constrained Optimization.
IEEE Trans Cybern. 2018 Nov;48(11):3045-3055. doi: 10.1109/TCYB.2017.2755720. Epub 2017 Oct 5.
7
Distributed Primal-Dual Subgradient Method for Multiagent Optimization via Consensus Algorithms.
IEEE Trans Syst Man Cybern B Cybern. 2011 Dec;41(6):1715-24. doi: 10.1109/TSMCB.2011.2160394. Epub 2011 Aug 4.
8
Distributed Frank-Wolfe Solver for Stochastic Optimization With Coupled Inequality Constraints.
IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):7858-7872. doi: 10.1109/TNNLS.2024.3423376. Epub 2025 May 2.
9
Distributed inertial online game algorithm for tracking generalized Nash equilibria.
Chaos. 2023 Oct 1;33(10). doi: 10.1063/5.0155863.
10
Distributed Optimization for Second-Order Discrete-Time Multiagent Systems With Set Constraints.
IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5629-5639. doi: 10.1109/TNNLS.2021.3130173. Epub 2023 Sep 1.