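LIU Guoqing (刘国庆, 1995- ), male, master's student at the School of Computer Science, Hangzhou Dianzi University. His main research interests are software testing and machine learning.
WANG Xingqi (王兴起, 1974- ), male, Ph.D., professor at Hangzhou Dianzi University. His main research interests include data mining and software testing.
WEI Dan (魏丹, 1979- ), female, Ph.D., lecturer at Hangzhou Dianzi University. Her main research interests include data mining and intelligent software engineering.
FANG Jinglong (方景龙, 1964- ), male, Ph.D., professor at Hangzhou Dianzi University. His main research interests include intelligent software engineering, machine learning, and artificial intelligence.
SHAO Yanli (邵艳利, 1989- ), female, Ph.D., associate researcher at Hangzhou Dianzi University. Her main research interests include CAD&CAE, MBSE, intelligent software engineering, and data mining.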
Online publication date: 2021-05
Print publication date: 2021-05-20
Guoqing LIU, Xingqi WANG, Dan WEI, et al. Feature selection method for software defect number prediction based on maximum information coefficient[J]. Telecommunications Science (电信科学), 2021, 37(5): 133-147. DOI: 10.11959/j.issn.1000-0801.2021025.
Traditional feature selection methods consider only linear relationships between variables and ignore nonlinear correlations. This makes it difficult to select effective feature subsets and lowers the performance of models that predict the number of defects in software modules. To address this problem, a feature selection method based on the maximum information coefficient (MIC) is proposed. The method accounts for both linear and nonlinear relationships, between features and between each feature and the number of defects, and separates redundancy analysis and relevance analysis into two phases. In the redundancy analysis phase, an agglomerative hierarchical clustering algorithm, driven by the correlation between features, groups redundant features into the same cluster. In the relevance analysis phase, the features in each cluster are sorted in descending order of their correlation with the number of software defects, and the top-ranked features of each cluster are selected to form the feature subset. The experimental results show that the proposed method selects effective feature subsets and improves the prediction performance of software defect number prediction models by removing redundant and irrelevant features.
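The two-phase procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions: `approx_mic` is a simplified stand-in that scores dependence with normalized mutual information over equal-frequency grids (it is not the MIC estimator of Reshef et al.), the clustering uses average linkage with a fixed number of clusters, and the feature names (`loc`, `loc_copy`, `wmc`), the toy data, and the `top_k` parameter are hypothetical.

```python
import math
from collections import Counter

def _equal_freq_bins(values, k):
    """Map each value to one of k rank-based bins; tied values share a bin."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        prev = order[rank - 1]
        if rank > 0 and values[i] == values[prev]:
            bins[i] = bins[prev]
        else:
            bins[i] = rank * k // n
    return bins

def _mutual_info(bx, by):
    """Mutual information (in nats) between two discrete label sequences."""
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def approx_mic(x, y):
    """Simplified MIC stand-in (assumption): best normalized mutual
    information over equal-frequency grids with a * b <= n ** 0.6."""
    budget = max(4, int(len(x) ** 0.6))
    best = 0.0
    for a in range(2, budget // 2 + 1):
        bx = _equal_freq_bins(x, a)
        for b in range(2, budget // a + 1):
            by = _equal_freq_bins(y, b)
            best = max(best, _mutual_info(bx, by) / math.log(min(a, b)))
    return min(best, 1.0)

def cluster_features(features, n_clusters):
    """Phase 1: agglomerative clustering (average linkage, an assumption)
    on the distance 1 - dependence, so redundant features end up together."""
    names = list(features)
    dep = {(p, q): approx_mic(features[p], features[q])
           for p in names for q in names if p < q}

    def dist(p, q):
        return 1.0 - dep[(p, q) if p < q else (q, p)]

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(p, q) for p in clusters[i] for q in clusters[j]]
                d = sum(dist(p, q) for p, q in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters

def select_features(features, target, n_clusters, top_k=1):
    """Phase 2: rank each cluster by dependence on the defect count
    and keep the top_k features of every cluster."""
    relevance = {name: approx_mic(col, target) for name, col in features.items()}
    selected = []
    for cluster in cluster_features(features, n_clusters):
        ranked = sorted(cluster, key=relevance.get, reverse=True)
        selected.extend(ranked[:top_k])
    return selected

# Hypothetical toy data: 60 modules, three metrics, one defect count.
loc = list(range(60))
features = {
    "loc": loc,
    "loc_copy": [2 * v + 1 for v in loc],   # redundant with loc
    "wmc": [(7 * v) % 13 for v in loc],     # unrelated to loc
}
defects = [(v - 30) ** 2 // 100 for v in loc]  # nonlinear in loc

selected = select_features(features, defects, n_clusters=2)
print(selected)
```

In this toy setup the two redundant metrics fall into one cluster and only one of them is kept, which is the behavior the redundancy phase is meant to produce. A real experiment would replace `approx_mic` with a proper MIC estimator and choose the number of clusters and `top_k` empirically.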