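School of Statistics and Data Science, Xinjiang University of Finance and Economics, Urumqi 830012, Xinjiang, China
WANG Shengjie (1997- ), male, is a doctoral student at Xinjiang University of Finance and Economics. His main research interests are machine learning and its applications, and deep learning.
ZHANG Qinghong (1973- ), female, PhD, is a professor and doctoral supervisor at Xinjiang University of Finance and Economics. Her main research interest is statistics.
Received: 2024-04-17
Revised: 2024-05-21
Print publication date: 2024-07-20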
WANG Shengjie, ZHANG Qinghong. Research on telecom industry customer churn prediction based on explainable machine learning models[J]. Telecommunications Science, 2024, 40(07): 121-133. DOI: 10.11959/j.issn.1000-0801.2024166.
In the telecom industry, accurate prediction of customer churn is crucial for companies to maintain market competitiveness and increase revenue. To this end, a customer churn prediction framework combining the CatBoost algorithm with the SHAP (Shapley additive explanations) model is proposed, aiming to improve prediction accuracy while enhancing model interpretability. Using actual business data from a communication company in Xinjiang, the prediction model was constructed through data preprocessing and feature engineering, and five key performance indicators were selected to evaluate its performance. The experimental results show that the proposed model outperforms current mainstream machine learning prediction models on all of the selected evaluation indicators. Finally, the SHAP framework was introduced to enhance model interpretability, revealing the key factors affecting customer churn and quantifying the degree of influence of each factor, which provides a scientific basis for telecommunications enterprises to formulate targeted customer retention strategies.
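To make the idea of per-factor influence concrete, the following is a minimal sketch of the Shapley attribution that SHAP is built on, computed by exact enumeration over feature subsets. The toy scoring function, the feature names (`short_tenure`, `high_complaints`, `on_contract`), the weights, and the baseline customer are all hypothetical and are not taken from the paper; the paper's own model is CatBoost, and practical SHAP tooling uses faster algorithms than this brute-force loop.

```python
from itertools import combinations
from math import factorial


def churn_score(features):
    """Hypothetical toy churn model (NOT the paper's CatBoost model)."""
    score = 0.2
    if features["short_tenure"]:
        score += 0.3
    if features["high_complaints"]:
        score += 0.2
    # Interaction term: both risk factors together add extra risk.
    if features["short_tenure"] and features["high_complaints"]:
        score += 0.1
    if features["on_contract"]:
        score -= 0.15
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`.

    Features outside a coalition take their baseline value. The cost is
    exponential in the number of features, so this only suits tiny examples.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        mixed = {k: instance[k] if k in coalition else baseline[k] for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


# Hypothetical customer to explain, and a hypothetical low-risk reference customer.
instance = {"short_tenure": True, "high_complaints": True, "on_contract": False}
baseline = {"short_tenure": False, "high_complaints": False, "on_contract": True}

phi = shapley_values(churn_score, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score of the explained customer and the score of the baseline customer, which is the additivity property that lets each factor's influence be read off directly. In this toy case the interaction term is split evenly between the two interacting features, giving contributions of approximately 0.35, 0.25, and 0.15.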