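1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei
2. Hebei Key Laboratory of Information Transmission and Signal Processing, Qinhuangdao 066004, Hebei
3. Hebei Provincial Laboratory of Computer Virtual Technology, Qinhuangdao 066004, Hebei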
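Mingxin LIU (1976-), male, PhD, is a professor at the School of Information Science and Engineering, Yanshan University, and a master's supervisor at the Hebei Key Laboratory of Information Transmission and Signal Processing. His main research interests are the Internet of Things and wireless sensor networks.
Jing CHEN (1976-), female, PhD, is an associate professor at the School of Information Science and Engineering, Yanshan University, and a master's supervisor at the Hebei Provincial Laboratory of Computer Virtual Technology. Her main research interests are Web services and sentiment analysis of social networks.
Qiyuan WANG (1991-), female, is a master's student at the School of Information Science and Engineering, Yanshan University. Her main research interests are wireless sensor networks and sentiment analysis.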
Online publication date: 2018-10
Print publication date: 2018-10-20
Mingxin LIU, Jing CHEN, Qiyuan WANG. Research on text sentiment classification based on improved feature selection method[J]. Telecommunications Science (电信科学), 2018, 34(10): 85-95. DOI: 10.11959/j.issn.1000-0801.2018250.
An improved information gain feature selection method combined with a sentiment dictionary is proposed. First, because existing information gain feature selection emphasizes the document frequency of feature words while neglecting corpus balance, an improved method is proposed. Second, considering the influence of sentiment words on text classification, a sentiment-dictionary-based feature selection algorithm, IGSC (information gain combining sentiment classification), is proposed for text classification. By matching the sentiment words in a text and assigning them weights, the algorithm reduces the feature dimension and addresses the loss of classification performance caused by sparse text data. Finally, the proposed feature selection method is verified and analyzed experimentally on a travel review data set. The results show that the improved method raises classification accuracy, recall, and F value, and that it yields good classification stability.
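As a rough illustration of the idea the abstract describes, the sketch below scores terms with the textbook information gain formula and then multiplies the score of any term found in a sentiment lexicon by a constant. It is not the authors' IGSC algorithm: the abstract does not give the improved formula or the weighting scheme, so the function names, the `boost` parameter, and the toy reviews are all assumptions made for illustration only.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, term):
    """Textbook IG: H(C) - P(t) H(C|t) - P(not t) H(C|not t).

    docs is a list of token sets, labels the matching class labels.
    """
    classes = sorted(set(labels))
    with_t = Counter(l for d, l in zip(docs, labels) if term in d)
    without_t = Counter(l for d, l in zip(docs, labels) if term not in d)
    n = len(docs)
    n_t = sum(with_t.values())
    h_c = entropy([labels.count(c) for c in classes])
    h_t = entropy([with_t[c] for c in classes])
    h_not = entropy([without_t[c] for c in classes])
    return h_c - (n_t / n) * h_t - ((n - n_t) / n) * h_not


def select_features(docs, labels, lexicon, k, boost=1.5):
    """Keep the k best terms; lexicon terms get their IG multiplied by
    `boost` (a hypothetical weighting, not taken from the paper)."""
    vocab = set().union(*docs)
    scores = {
        t: information_gain(docs, labels, t) * (boost if t in lexicon else 1.0)
        for t in vocab
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


# Toy travel reviews, already tokenized (invented data).
docs = [{"great", "hotel"}, {"terrible", "hotel"},
        {"great", "view"}, {"terrible", "food"}]
labels = ["pos", "neg", "pos", "neg"]
print(select_features(docs, labels, lexicon={"great", "terrible"}, k=2))
```

In this toy corpus "hotel" appears equally often in both classes and so carries no information gain, while the two lexicon words separate the classes perfectly and are ranked first.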