Value-difference learning based mMTC devices access algorithm in multi-cell network

Xin LI; Jun SUN

doi:10.11959/j.issn.1000-0801.2022152


Research and Development | Updated: 2024-06-05

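- Value-difference learning based mMTC devices access algorithm in multi-cell network
- Telecommunications Science, 2022, Vol. 38, No. 6, pp. 82-90
- Author affiliations:
  
  1. College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
  2. Jiangsu Key Laboratory of Wireless Communications, Nanjing 210003, Jiangsu, China
- Author biographies:
  
  Xin LI (b. 1997), female, master's student at the College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications. Her main research interest is random access for massive-connectivity IoT devices.
  Jun SUN (b. 1980), female, associate researcher and master's supervisor at Nanjing University of Posts and Telecommunications. Her main research interests are wireless networks, radio resource management, and the Internet of Things.
- Funding:
  
  The National Natural Science Foundation of China (61771255); Provincial and Ministerial Key Laboratory Open Project (20190904)
- DOI: 10.11959/j.issn.1000-0801.2022152
  CLC number: TN929.5
- Published online: 2022-06
  
  Print publication date: 2022-06-20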
Xin LI, Jun SUN. Value-difference learning based mMTC devices access algorithm in multi-cell network[J]. Telecommunications Science, 2022, 38(6): 82-90. DOI: 10.11959/j.issn.1000-0801.2022152.

Abstract

In the 5G massive machine type communication scenario, access congestion among massive machine type communication devices (mMTCD) is a pressing problem. To address it, a double deep Q network with value-difference based exploration (VDBE-DDQN) algorithm is proposed. The algorithm targets the problem of many mMTCDs accessing base stations (eNBs) in a multi-cell network, and models the state transition process of the deep reinforcement learning algorithm as a Markov decision process. It uses a double deep Q network to fit the target state-action value function and adopts a value-difference based exploration strategy, which draws on both current conditions and expected future needs to adapt to changes in the environment. Each mMTCD updates its own exploration probability according to the difference between the current value function and the next-step value function estimated by the network, rather than applying one uniform standard to all devices, so that the best eNB is selected for each mMTCD. Simulation results show that the proposed algorithm can effectively improve the access success rate of the system.
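The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: this page does not reproduce the paper's equations, so the sketch assumes the standard Double DQN target (the online network picks the action, the target network scores it) and the general VDBE update form commonly attributed to Tokic. The function names, the parameters `sigma` (sensitivity) and `delta` (mixing weight), and the use of plain lists in place of neural-network outputs are all hypothetical choices made for illustration.

```python
import math
import random


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: the online net selects the action, the target net evaluates it.

    q_online_next and q_target_next are the per-eNB Q-values for the next state
    (plain lists here; in practice they would be network outputs).
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def vdbe_epsilon(epsilon, value_difference, sigma=1.0, delta=0.5):
    """One VDBE step (assumed general form, not taken from the paper).

    A large value difference pushes epsilon toward 1 (explore more);
    a small one lets epsilon decay (exploit more).
    """
    x = math.exp(-abs(value_difference) / sigma)
    f = (1.0 - x) / (1.0 + x)  # lies in [0, 1)
    return delta * f + (1.0 - delta) * epsilon


def select_enb(q_values, epsilon, rng=random):
    """Epsilon-greedy eNB choice using a device's own exploration probability."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In this sketch each device would keep its own `epsilon`, feed the gap between its current Q-value and the Double DQN target into `vdbe_epsilon` after every access attempt, and then call `select_enb` for the next attempt, which mirrors the per-device exploration probability described in the abstract.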


