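1. Research Institute of China Mobile Communications Group Co., Ltd., Beijing 100053
2. College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712199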
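HE Hongshun (和红顺, 1990- ), male, Ph.D., is a frontier technology researcher at the Research Institute of China Mobile Communications Group Co., Ltd. His main research interests include machine learning, service identification, computer vision, and technologies for innovative 6G services.
HU Guoliang (胡国良, 1991- ), male, Ph.D., is a lecturer at the College of Information Engineering, Northwest A&F University. His main research interests include computer architecture, binocular computer vision, image processing, and service identification.
ZHANG Zhipeng (张志鹏, 1972- ), male, Ph.D., is a senior principal researcher at the Research Institute of China Mobile Communications Co., Ltd. His main research interests include machine learning, computer vision, and key technologies and product innovation for AI-enabled smart industry.
CHAI Xingang (柴鑫刚, 1976- ), male, is a senior engineer and technical manager of the service department at the Research Institute of China Mobile Communications Co., Ltd. His main research interests include video networking, computer vision, 6G service identification, and key technologies and product innovation for innovative services.
GAO Jing (高静, 1976- ), female, is a project manager at the Research Institute of China Mobile Communications Co., Ltd. Her main research interests include AI technologies for large-scale video, and key technologies and product innovation for 6G immersive media services.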
Revised: 2025-07-21
Accepted: 2025-08-06
Published online: 2026-01-06
HE Hongshun, HU Guoliang, ZHANG Zhipeng, et al. A combined imbalanced classification approach based on D-S Evidence Theory and its application in network traffic recognition[J]. Telecommunications Science, DOI: 10.11959/j.issn.1000-0801.2026025.
Class-imbalanced classification is a common challenge in machine learning and arises widely in practical applications such as network traffic recognition. To address it, a combined imbalanced classification approach based on Dempster-Shafer (D-S) evidence theory is proposed. Several undersampling-based and oversampling-based classification algorithms are used for modeling, a multiple attribute decision making method converts their different evaluation outputs into mass functions, and an evidence combination rule fuses the mass functions to obtain the final recognition result. The approach was validated by cross-validation on synthetic datasets and UCI benchmark datasets using neural network and random forest classifiers, and was then applied to real-world network traffic identification tasks. The experimental results show that the proposed approach handles class-imbalanced classification better, with significant improvements in recall, F1-score, and G-mean.
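The fusion step described in the abstract can be illustrated with a minimal sketch of Dempster's rule of combination for a binary task. This is not the authors' implementation: the abstract does not say how mass functions are built from the evaluation outputs, so the `discounted_mass` helper, its `reliability` weight, and the example probabilities below are hypothetical choices made only to keep the example self-contained.

```python
from itertools import product

# Frame of discernment for a binary task: minority ("pos") and majority ("neg").
THETA = frozenset({"pos", "neg"})


def discounted_mass(p_pos, reliability):
    """Turn one classifier's minority-class probability into a mass function.

    Assumption for illustration only: the classifier's belief is scaled by a
    reliability weight in [0, 1] and the remainder is assigned to THETA
    (ignorance). The paper derives its mass functions with a multiple
    attribute decision making method that is not reproduced here.
    """
    return {
        frozenset({"pos"}): reliability * p_pos,
        frozenset({"neg"}): reliability * (1.0 - p_pos),
        THETA: 1.0 - reliability,
    }


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass on contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalise by the non-conflicting mass so the result sums to 1.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}


def fuse(masses):
    """Fuse a list of mass functions by applying Dempster's rule pairwise."""
    fused = masses[0]
    for m in masses[1:]:
        fused = dempster_combine(fused, m)
    return fused


def decide(mass):
    """Return the single class with the largest fused mass."""
    singletons = {next(iter(f)): v for f, v in mass.items() if len(f) == 1}
    return max(singletons, key=singletons.get)


if __name__ == "__main__":
    # Hypothetical outputs of an undersampling-based and an oversampling-based model.
    m_under = discounted_mass(p_pos=0.7, reliability=0.8)
    m_over = discounted_mass(p_pos=0.4, reliability=0.6)
    fused = fuse([m_under, m_over])
    print(decide(fused), fused)
```

Leaving part of each mass on the whole frame is one common way to keep a single overconfident model from dominating the fused result; other constructions are equally possible.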