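1. China United Network Communications Group Co., Ltd., Beijing 100033
2. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876
3. Xinxun Digital Technology Co., Ltd. (新讯数字科技有限公司), Beijing 100091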
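ZHANG Lei (1978- ), female, senior engineer at China United Network Communications Group Co., Ltd. Her main research interests include communication networks, cloud-network convergence, and artificial intelligence.
JING Yuhan (1995- ), female, Ph.D. candidate at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, and technical expert at Xinxun Digital Technology Co., Ltd. Her main research interests include AIOps, time series analysis, anomaly detection, and fault localization.
HE Bo (1995- ), male, Ph.D., postdoctoral researcher at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications. His main research interests include 5G/6G networks, multipath networks, collective communication, transmission control, and deep reinforcement learning.
QI Qi (1982- ), female, Ph.D., professor and doctoral supervisor at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications. Her main research interests include intelligent edge computing, lightweight neural networks, and service network intelligence.
CHEN Chen (1986- ), male, architect at Xinxun Digital Technology Co., Ltd. His main research interests include applications of intelligent operations and maintenance, large language models, and key digital human technologies.
WANG Jingyu (1978- ), male, Ph.D., professor and doctoral supervisor at the State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications. His main research interests include intelligent networks, artificial intelligence, and multimedia communication.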
Received: 2023-12-30
Revised: 2024-03-10
Published in print: 2024-05-20
ZHANG Lei, JING Yuhan, HE Bo, et al. A method of building alarm causality graph for anomaly events in network services[J]. Telecommunications Science, 2024, 40(05): 152-164. DOI: 10.11959/j.issn.1000-0801.2024091.
In network service systems, anomaly events often trigger a large number of alarm events, forming alarm storms. Operators must spend considerable time and effort searching these alarms for key information and identifying the root cause of the anomaly. To reduce the number of alarms that operators need to handle, and to extract the root alarms from an alarm storm intelligently and automatically, a method for constructing an alarm causality graph is proposed, based on an analysis of the propagation patterns of network service alarms. The method is applied to extract the key information of the alarm storm that arises when an anomaly event occurs. Experiments on a real dataset from an operator's live network management system verify the quality of the generated alarm causality graph through an alarm storm summary extraction task, and a related case is analyzed for its physical significance. The results show that alarm storm summary extraction based on the generated alarm causality graph reaches a recall of 96% and retains the vast majority of key information. In addition, when the method is used to compress the alarms produced by the system, the compression rate reaches 66.5% for alarm codes that are difficult to compress.
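The two reported metrics can be illustrated with a minimal sketch. It assumes that an alarm storm summary keeps only the alarms with no incoming edge in the causality graph, that recall is the share of labeled key alarms retained in the summary, and that the compression rate is the share of alarms removed. These definitions, the function names, and the alarm codes are all hypothetical and are not taken from the paper; how the graph itself is built is not covered here.

```python
def root_alarms(edges, alarms):
    """Return the distinct alarm codes that no other alarm points to.

    edges  -- iterable of (cause, effect) pairs from a causality graph
    alarms -- alarm codes observed in the storm, duplicates allowed
    """
    has_cause = {effect for _cause, effect in edges}
    # dict.fromkeys removes duplicates while keeping first-seen order
    return [a for a in dict.fromkeys(alarms) if a not in has_cause]


def recall(summary, key_alarms):
    """Share of labeled key alarms that survive in the summary."""
    key = set(key_alarms)
    if not key:
        return 1.0
    return len(key & set(summary)) / len(key)


def compression_rate(n_original, n_kept):
    """Share of alarms removed (assumed definition)."""
    return 1.0 - n_kept / n_original


# Hypothetical storm: six alarm events, three assumed causal edges.
storm = ["LINK_DOWN", "PORT_ERR", "BGP_FLAP",
         "SVC_TIMEOUT", "SVC_TIMEOUT", "CPU_HIGH"]
edges = [("LINK_DOWN", "PORT_ERR"),
         ("PORT_ERR", "BGP_FLAP"),
         ("BGP_FLAP", "SVC_TIMEOUT")]

summary = root_alarms(edges, storm)
print(summary)
print(recall(summary, {"LINK_DOWN"}))
print(compression_rate(len(storm), len(summary)))
```

In this toy storm, the alarms downstream of `LINK_DOWN` are dropped, while `CPU_HIGH` stays in the summary because nothing in the graph explains it.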