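1. Inspur Communication Information System Co., Ltd., Jinan, Shandong 250100
2. Research Institute of China United Network Communications Co., Ltd., Beijing 100048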
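Linjiang SHEN (1981- ), male, is deputy general manager of Inspur Communication Information System Co., Ltd. and dean of its Computing Power Network Research Institute. He works mainly on frontier theoretical analysis, technical research, and product design for computing power networks.
Chang CAO (1984- ), male, PhD, is director of the Future Network Research Department and a senior engineer at the Research Institute of China United Network Communications Co., Ltd. He works mainly on computing power networks, new IPv6+ network technologies, and future network architecture.
Chao CUI (1993- ), male, works at Inspur Communication Information System Co., Ltd. He works mainly on computing power networks and AI algorithms.
Yan ZHANG (1983- ), male, PhD, is a principal researcher in the Future Network Research Department and a senior engineer at the Research Institute of China United Network Communications Co., Ltd. He works mainly on computing power networks, cloud-network convergence and cloud computing, and future network architecture.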
Published online: 2023-08
Published in print: 2023-08-20
Linjiang SHEN, Chang CAO, Chao CUI, et al. Research on constrained policy reinforcement learning based multi-objective optimization of computing power network[J]. Telecommunications Science, 2023, 39(8): 136-148. DOI: 10.11959/j.issn.1000-0801.2023165.
A computing power network needs to maximize system performance indicators while meeting users' service requirements. Existing methods mainly convert the problem by multi-objective weighting before solving it, which makes the hyperparameters hard to determine and limits applicability across scenarios. Based on an analysis of the characteristics of computing power network objectives, constrained policy reinforcement learning is applied, with service requirements taken as constraints and system performance indicators taken as the optimization objective. A multi-level value-policy-hyperparameter iteration gives the network a deterministic guarantee, in expectation, of user service requirements while optimizing system performance. A multi-scale step length (MSL) method for hyperparameter optimization is also studied, which further improves the stability and accuracy of the system. Simulation results show that the proposed method converges well and remains stable in single-terminal single-edge-server and multi-terminal multi-edge-server settings and under changing system load.
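The abstract's idea of treating the service requirement as a constraint rather than as a weighted objective can be illustrated with a generic Lagrangian sketch. The following is not the paper's algorithm: it is a minimal toy under stated assumptions. The policy is a single offloading probability `p`, exact expectations stand in for a learned value function, the "hyperparameter" is assumed to be a Lagrange multiplier `lam`, and the coarse/fine step switch is only a hypothetical stand-in for a multi-scale step length. The names `ToyScenario` and `solve`, and all numeric values, are illustrative.

```python
from dataclasses import dataclass


@dataclass
class ToyScenario:
    """Hypothetical single-terminal, single-edge-server setting."""
    d_local: float = 10.0    # latency when the task runs locally (assumed)
    d_edge: float = 2.0      # latency when the task is offloaded (assumed)
    d_max: float = 6.0       # service requirement: expected latency bound
    cost_scale: float = 4.0  # system cost grows as cost_scale * p**2 (assumed)


def expected_latency(p: float, s: ToyScenario) -> float:
    """Expected latency under offloading probability p."""
    return p * s.d_edge + (1.0 - p) * s.d_local


def system_cost(p: float, s: ToyScenario) -> float:
    """System performance objective to minimize (toy congestion cost)."""
    return s.cost_scale * p * p


def solve(s: ToyScenario, outer_iters: int = 200, inner_iters: int = 5,
          policy_lr: float = 0.1, coarse_step: float = 0.1,
          fine_step: float = 0.02, switch_threshold: float = 0.5):
    """Alternate policy updates and multiplier updates on the Lagrangian
    system_cost(p) + lam * (expected_latency(p) - d_max)."""
    p, lam = 0.1, 0.0
    for _ in range(outer_iters):
        # Policy level: projected gradient descent on the Lagrangian in p.
        for _ in range(inner_iters):
            grad = 2.0 * s.cost_scale * p + lam * (s.d_edge - s.d_local)
            p = min(1.0, max(0.0, p - policy_lr * grad))
        # Value level: exact expectation replaces a learned critic here.
        violation = expected_latency(p, s) - s.d_max
        # Multiplier level: large step far from the bound, small step near it.
        step = coarse_step if abs(violation) > switch_threshold else fine_step
        lam = max(0.0, lam + step * violation)
    return p, lam


if __name__ == "__main__":
    scenario = ToyScenario()
    p, lam = solve(scenario)
    print(f"p={p:.4f} lam={lam:.4f} "
          f"latency={expected_latency(p, scenario):.4f} "
          f"cost={system_cost(p, scenario):.4f}")
```

With these toy numbers the constrained optimum can be checked by hand: the latency bound is tight at `p = 0.5`, and stationarity of the Lagrangian gives `lam = 0.5`. No weight between latency and cost has to be chosen in advance, since the multiplier is adjusted until the constraint holds.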