A network traffic classification method based on random forest and improved convolutional neural network

Bensheng YUN; Xiaoya GAN; Yaguan QIAN

doi:10.11959/j.issn.1000-0801.2023138

Research and Development | Updated: 2024-06-05

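- Telecommunications Science, 2023, Vol. 39, No. 7, pp. 80-89
- Author biographies:
  
  Bensheng YUN (1980- ), male, Ph.D., associate professor at the School of Science, Zhejiang University of Science and Technology. Main research interests: big data analysis and mining, and machine learning.
  Xiaoya GAN (2000- ), female, currently a student at the School of Science, Zhejiang University of Science and Technology. Main research interest: big data analysis.
  Yaguan QIAN (1976- ), male, Ph.D., professor at the School of Science, Zhejiang University of Science and Technology. Main research interests: deep learning, artificial intelligence security, and big data processing.
- Funding:
  
  The National Natural Science Foundation of China (61972357); The Natural Science Foundation of Zhejiang Province of China (LZ22F020007)
- DOI: 10.11959/j.issn.1000-0801.2023138
  CLC number: TP393
- Published online: 2023-07
  
  Published in print: 2023-07-20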

Bensheng YUN, Xiaoya GAN, Yaguan QIAN. A network traffic classification method based on random forest and improved convolutional neural network[J]. Telecommunications Science, 2023, 39(7): 80-89. DOI: 10.11959/j.issn.1000-0801.2023138.

Abstract

To improve the efficiency and reduce the complexity of network traffic classification models, a classification method based on random forest and an improved convolutional neural network is proposed. First, random forest is used to evaluate the importance of each network traffic feature, and features are selected according to the importance ranking. Second, the AdamW optimizer and a triangular cyclic learning rate are adopted to optimize the convolutional neural network classification model. Finally, the model is deployed on a Spark cluster to parallelize model training. With a triangular cyclic learning rate of constant cycle amplitude, experiments using the 1024, 400, 256, and 100 most important features as input show that the model accuracy is improved to 97.68%, 95.84%, 95.03%, and 94.22%, respectively. With the 256 most important features selected, experiments with different learning rates show that the triangular cyclic learning rate with halved cycle amplitude performs best: the model accuracy is improved to 95.25% and the model training time is reduced by nearly half.
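
The two learning rate schedules compared in the abstract can be illustrated with a minimal sketch. It assumes that "constant cycle amplitude" and "halved cycle amplitude" correspond to what the cyclical learning rate literature usually calls the triangular and triangular2 policies; the abstract does not confirm this mapping. The function name `triangular_lr` and the values of `BASE_LR`, `MAX_LR`, and `STEP_SIZE` are hypothetical, since the abstract does not report the learning rate bounds or the cycle length.

```python
def triangular_lr(step, base_lr, max_lr, step_size, halve_amplitude=False):
    """Learning rate at a given 0-based training step.

    One cycle lasts 2 * step_size steps: the rate climbs linearly from
    base_lr to the peak over step_size steps, then falls back to base_lr.
    With halve_amplitude=True, the peak height above base_lr is halved
    after every completed cycle.
    """
    cycle = step // (2 * step_size)            # 0-based index of the current cycle
    # Distance from the cycle's peak: 1 at the cycle ends, 0 at the peak.
    x = abs(step / step_size - 2 * cycle - 1)
    scale = 0.5 ** cycle if halve_amplitude else 1.0
    return base_lr + (max_lr - base_lr) * (1.0 - x) * scale


# Hypothetical settings, chosen only for illustration.
BASE_LR, MAX_LR, STEP_SIZE = 1e-4, 1e-3, 4

# Two full cycles of each schedule (16 steps).
constant = [triangular_lr(s, BASE_LR, MAX_LR, STEP_SIZE) for s in range(16)]
halved = [triangular_lr(s, BASE_LR, MAX_LR, STEP_SIZE, halve_amplitude=True)
          for s in range(16)]
```

In a training loop, the returned value would be written into the optimizer's learning rate before each update. If the model were trained with PyTorch, which the abstract does not state, `torch.optim.lr_scheduler.CyclicLR` offers `triangular` and `triangular2` modes that play a similar role.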

