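Yibo CHEN (1998- ), male, master's student at the School of Communication Engineering, Hangzhou Dianzi University. His main research interest is cognitive radio.
Zhijin ZHAO (1959- ), female, Ph.D., professor and doctoral supervisor at the School of Communication Engineering, Hangzhou Dianzi University. Her main research interests are signal processing and cognitive radio technology.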
Published online: 2022-11
Published in print: 2022-11-20
Yibo CHEN, Zhijin ZHAO. Intelligent anti-jamming decision algorithm of bivariate frequency hopping pattern based on ET-PPO[J]. Telecommunications Science, 2022, 38(11): 86-95. DOI: 10.11959/j.issn.1000-0801.2022264.
To further improve the anti-jamming capability of bivariate frequency hopping systems in complex electromagnetic environments, a proximal policy optimization algorithm with eligibility traces (ET-PPO) is proposed. Time-varying parameters are introduced on top of the traditional frequency hopping pattern, and the bivariate frequency hopping pattern decision problem is modeled as a Markov decision problem by constructing a state-action-reward triple. To address the high variance of the sample update method used by the actor network in PPO, weighted importance sampling is introduced to reduce the variance, and a Beta-distribution action selection policy is adopted to improve stability during learning. To address the slow convergence of the critic network, the eligibility trace method is introduced, which better balances convergence speed against the search for the global optimum. Comparative simulations under different electromagnetic jamming environments show that ET-PPO adapts better and is more stable than the compared algorithms, and that it performs well against blocking jamming and sweep jamming.
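The abstract names three generic building blocks without giving their equations: weighted importance sampling, Beta-distributed action selection, and eligibility traces for the critic. The following is a minimal sketch of the textbook form of each, not the authors' implementation. The function names, the linear critic, and all hyperparameter values are assumptions made for illustration only.

```python
import random


def weighted_importance_weights(new_probs, old_probs):
    """Self-normalized (weighted) importance sampling weights.

    Each ratio pi_new(a|s) / pi_old(a|s) is divided by the sum of all
    ratios in the batch, so the weights sum to 1. This bounds the
    influence of any single sample, which lowers variance compared
    with ordinary importance sampling.
    """
    ratios = [p_new / p_old for p_new, p_old in zip(new_probs, old_probs)]
    total = sum(ratios)
    return [r / total for r in ratios]


def sample_beta_action(alpha, beta, low, high, rng=random):
    """Draw a bounded continuous action from a scaled Beta distribution.

    A Beta sample lies in [0, 1]; it is rescaled to [low, high], so the
    action never needs clipping (hypothetical helper, assumed bounds).
    """
    return low + (high - low) * rng.betavariate(alpha, beta)


def td_lambda_episode(weights, features, rewards, gamma=0.9, lam=0.8, lr=0.1):
    """One episode of TD(lambda) with accumulating eligibility traces.

    Assumes a linear critic V(s) = w . x(s). features[t] is the feature
    vector of the state visited at step t, rewards[t] is the reward
    received after that step, and the state after the last step is
    treated as terminal with value 0.
    """
    w = list(weights)
    trace = [0.0] * len(w)
    last = len(rewards) - 1
    for t, reward in enumerate(rewards):
        x = features[t]
        v = sum(wi * xi for wi, xi in zip(w, x))
        if t == last:
            v_next = 0.0
        else:
            v_next = sum(wi * xi for wi, xi in zip(w, features[t + 1]))
        delta = reward + gamma * v_next - v  # TD error
        # Decay the old traces, then add the current state's features.
        trace = [gamma * lam * e + xi for e, xi in zip(trace, x)]
        # Every recently visited state shares in the TD error.
        w = [wi + lr * delta * e for wi, e in zip(w, trace)]
    return w
```

In this sketch, a reward received at the end of a two-step episode also updates the value of the first state, scaled by `gamma * lam`. That credit propagation is the usual reason eligibility traces speed up critic learning relative to one-step TD.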