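1. China Academy of Information and Communications Technology, Beijing 100191, China
2. Beijing University of Posts and Telecommunications, Beijing 100876, China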
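Bingjun Han, male, Ph.D., is a standards engineer at the China Academy of Information and Communications Technology. His main research interest is building system-level simulation platforms for mobile communications, with substantial experience in interference coexistence and in acceleration techniques for system-level simulation platforms.
Shiming Huang, male, is a master's student at Beijing University of Posts and Telecommunications. His main research interest is building system-level simulation platforms for communications, with substantial experience in GPU acceleration and in parallelizing communication models.
Ying Du, female, is a senior engineer at the China Academy of Information and Communications Technology. She works mainly on wireless communication technology research, standardization and evaluation. She took part as a key member in the research and international standardization of 3GPP LTE and LTE-Advanced, and is currently responsible for LTE R13 international standardization and for pre-research on 5G international standardization.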
Published online: 2015-10
Published in print: 2015-10-20
Bingjun Han, Shiming Huang, Ying Du. A Simulation Accelerating Method Based on CUDA with Kepler GPU[J]. Telecommunications Science, 2015, 31(10): 82-88. DOI: 10.11959/j.issn.1000-0801.2015248.
An acceleration method based on CUDA (compute unified device architecture) on Kepler-architecture GPUs (graphics processing units) is proposed to speed up DFT (discrete Fourier transform) processing in communication simulation platforms. In this method, the whole DFT workload is split into subtasks called molecular-subtasks, each corresponding to one communication link, and each molecular-subtask is further split into smaller parallel subtasks called atomic-subtasks, which correspond to the DFT processing within that link. The atomic-subtasks are processed in parallel by the threads of a GPU kernel function (thread-level parallelism within a single transmit-receive link), while the molecular-subtasks of different transmit-receive user pairs are processed in parallel by several GPU kernel functions using dynamic parallelism and Hyper-Q, which shortens the simulation time. Experimental results show that the method speeds up DFT processing by more than 300 times compared with a single-core, single-threaded CPU program and by 3 times compared with a GPU program on the previous-generation Fermi architecture.
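The two-level decomposition described in the abstract can be illustrated with a small CPU-side sketch. This is not the authors' CUDA implementation: the function names `dft_bin`, `dft_link` and `dft_all_links` are hypothetical, a thread pool stands in for concurrently running GPU kernels, and a plain loop over output bins stands in for the threads inside one kernel. It only shows why the split works: each output bin X[k] = sum over n of x[n]·exp(-j2πkn/N) depends on the input samples alone, so bins are independent of one another, and the DFTs of different links share no data at all.

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft_bin(samples, k):
    """Atomic-subtask (hypothetical name): one DFT output bin X[k] of one link."""
    n_points = len(samples)
    return sum(
        x * cmath.exp(-2j * cmath.pi * k * n / n_points)
        for n, x in enumerate(samples)
    )


def dft_link(samples):
    """Molecular-subtask (hypothetical name): the full DFT of one link.

    Every bin is computed independently of the others, which is the
    property that lets a GPU assign one thread per bin.
    """
    return [dft_bin(samples, k) for k in range(len(samples))]


def dft_all_links(links, max_workers=4):
    """Run the per-link DFTs concurrently; results keep the input order.

    Assumption: the thread pool is only a stand-in for concurrent kernels.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(dft_link, links))


# Two toy "links": a unit impulse and a constant sequence.
links = [
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
spectra = dft_all_links(links)
```

For the toy inputs, the impulse has a flat spectrum and the constant sequence has all of its energy in bin 0, which makes the result easy to check by hand. CPython threads do not deliver GPU-like speedups for this arithmetic; the sketch is about the independence of the subtasks, not about performance.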