Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (5): 71-81    DOI: 10.11925/infotech.2096-3467.2017.05.09
Original Article
Classifying Sentiments Based on BPSO Random Subspace
Zhang Qingqing1,2, Liu Xilin2
1School of Management, Xi’an Polytechnic University, Xi’an 710048, China
2School of Management, Northwestern Polytechnical University, Xi’an 710129, China
Abstract  

[Objective] This paper addresses the problem of representing high-dimensional features in Chinese sentiment analysis using RS_BPSO, a selective ensemble algorithm. [Methods] First, we developed the framework and algorithm of the RS_BPSO model based on the Random Subspace (RS) method and Binary Particle Swarm Optimization (BPSO). Then, we transformed a Chinese review corpus into structured feature vectors and evaluated the new model on them. [Results] The RS_BPSO model achieved better diversity and accuracy than the standard RS model. [Limitations] We did not test the proposed model on corpora in other languages. [Conclusions] The RS_BPSO model could be an effective method for classifying sentiment in Chinese text.
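
The abstract names the two ingredients, Random Subspace and binary particle swarm optimization, without giving the algorithm itself. The following is a minimal sketch of how such a selective ensemble could be assembled, not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic data, the nearest-centroid base learner, the majority-vote fitness, the inertia weight `w`, the velocity clip at ±4, and all function names. Each particle is a 0/1 mask over the base learners, and each bit is resampled with probability given by the sigmoid of its velocity.

```python
import numpy as np

def train_random_subspace(X, y, n_learners, rate, rng):
    """Train one nearest-centroid learner per random feature subset."""
    n_features = X.shape[1]
    size = max(1, round(rate * n_features))  # subspace rate -> feature count
    learners = []
    for _ in range(n_learners):
        feats = rng.choice(n_features, size=size, replace=False)
        c0 = X[y == 0][:, feats].mean(axis=0)
        c1 = X[y == 1][:, feats].mean(axis=0)
        learners.append((feats, c0, c1))
    return learners

def predict_all(learners, X):
    """Return an (n_learners, n_samples) matrix of 0/1 predictions."""
    rows = []
    for feats, c0, c1 in learners:
        Z = X[:, feats]
        d0 = ((Z - c0) ** 2).sum(axis=1)
        d1 = ((Z - c1) ** 2).sum(axis=1)
        rows.append((d1 < d0).astype(int))
    return np.array(rows)

def vote_accuracy(mask, preds, y):
    """Majority-vote accuracy of the learners switched on in `mask`."""
    if mask.sum() == 0:
        return 0.0  # an empty ensemble is treated as the worst case
    votes = preds[mask.astype(bool)].mean(axis=0) >= 0.5
    return float((votes.astype(int) == y).mean())

def bpso_select(preds, y, rng, n_particles=20, n_iter=30, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO over learner masks (illustrative parameter values)."""
    m = preds.shape[0]
    x = rng.integers(0, 2, size=(n_particles, m))
    x[0] = 1  # seed one particle with the full ensemble as a baseline
    v = np.zeros((n_particles, m))
    pbest = x.copy()
    pbest_fit = np.array([vote_accuracy(p, preds, y) for p in x])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, m))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -4, 4)
        # Sigmoid transfer: each bit becomes 1 with probability sigmoid(v).
        x = (rng.random((n_particles, m)) < 1 / (1 + np.exp(-v))).astype(int)
        fit = np.array([vote_accuracy(p, preds, y) for p in x])
        improved = fit > pbest_fit
        pbest[improved] = x[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest, gbest_fit

# Synthetic stand-in for a review corpus: 200 features, only 20 informative.
rng = np.random.default_rng(42)
n, d = 400, 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, :20] += 0.5 * y[:, None]
X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

learners = train_random_subspace(X_tr, y_tr, n_learners=30, rate=0.05, rng=rng)
preds = predict_all(learners, X_val)
full_fit = vote_accuracy(np.ones(30, dtype=int), preds, y_val)
mask, best_fit = bpso_select(preds, y_val, rng=rng)
print(f"full ensemble: {full_fit:.3f}, selected {mask.sum()} learners: {best_fit:.3f}")
```

Because one particle starts as the full ensemble, the selected subset can never score below the plain RS vote on the selection data. That score is optimistic, however: a fair comparison would measure both ensembles on a separate held-out test set.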

Key words: Random Subspace; BPSO; Text Sentiment Classification; Subspace Rate
Received: 28 March 2017      Published: 06 June 2017
ZTFLH:  TP391.1  

Cite this article:

Zhang Qingqing,Liu Xilin. Classifying Sentiments Based on BPSO Random Subspace. Data Analysis and Knowledge Discovery, 2017, 1(5): 71-81.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.05.09     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I5/71

Dataset    Dependency-relation triples
Hotel      140 911
Book       66 297
Laptop     28 932
Measure    Formula    No.
Q statistic    $Q_{ij}=\frac{N^{11}N^{00}-N^{10}N^{01}}{N^{11}N^{00}+N^{10}N^{01}}$    (6)
Correlation coefficient $\rho$    $\rho_{ij}=\frac{N^{11}N^{00}-N^{10}N^{01}}{\sqrt{(N^{11}+N^{10})(N^{01}+N^{00})(N^{11}+N^{01})(N^{10}+N^{00})}}$    (7)
Disagreement measure dis    $dis_{ij}=(N^{10}+N^{01})/N$    (8)
Double-fault measure DF    $DF_{ij}=\frac{N^{00}}{N}$    (9)
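
Formulas (6) to (9) can be computed directly from the per-sample correctness of two classifiers. The sketch below assumes the conventional reading of the counts, which this page does not spell out: $N^{11}$ is the number of samples both classifiers get right, $N^{00}$ the number both get wrong, $N^{10}$ and $N^{01}$ the numbers where only the first or only the second is right, and $N$ their sum. The function name and the toy input are hypothetical.

```python
import math
import numpy as np

def pairwise_diversity(correct_i, correct_j):
    """Diversity measures (6)-(9) for two classifiers.

    correct_i, correct_j: sequences of booleans, True where the
    classifier labelled the sample correctly (assumed convention).
    """
    ci = np.asarray(correct_i, dtype=bool)
    cj = np.asarray(correct_j, dtype=bool)
    n11 = int(np.sum(ci & cj))    # both correct
    n00 = int(np.sum(~ci & ~cj))  # both wrong
    n10 = int(np.sum(ci & ~cj))   # only classifier i correct
    n01 = int(np.sum(~ci & cj))   # only classifier j correct
    n = n11 + n00 + n10 + n01

    q_den = n11 * n00 + n10 * n01
    rho_den = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    num = n11 * n00 - n10 * n01
    return {
        "Q": num / q_den if q_den else float("nan"),        # formula (6)
        "rho": num / rho_den if rho_den else float("nan"),  # formula (7)
        "dis": (n10 + n01) / n,                             # formula (8)
        "DF": n00 / n,                                      # formula (9)
    }

# Toy example: 10 samples, N11 = 4, N00 = 2, N10 = 2, N01 = 2.
ci = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
cj = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
measures = pairwise_diversity(ci, cj)
print(measures)
```

For an ensemble of more than two members, a common choice is to average each measure over all classifier pairs. Whether the article's reported values were aggregated that way is not stated on this page.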
k         Hotel      Book      Laptop
k=0.01    1 409      663       289
k=0.02    2 818      1 326     579
k=0.03    4 227      1 989     868
k=0.05    7 046      3 315     1 447
Total     140 911    66 297    28 932
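
The subspace sizes in this table are consistent with multiplying each dataset's total feature count by the rate k and rounding to the nearest integer. The rounding rule is inferred from the numbers, not stated in the source, and the English dataset labels and function name below are chosen for illustration.

```python
# Total feature counts per dataset, taken from the table's last row.
TOTAL_FEATURES = {"Hotel": 140911, "Book": 66297, "Laptop": 28932}
RATES = [0.01, 0.02, 0.03, 0.05]

def subspace_size(total: int, k: float) -> int:
    """Number of features drawn per subspace at rate k (assumed: nearest integer)."""
    return round(total * k)

sizes = {
    name: [subspace_size(total, k) for k in RATES]
    for name, total in TOTAL_FEATURES.items()
}
for name, row in sizes.items():
    print(name, row)
```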
k RS RS_BPSO
0.01 0.6825 0.8342(17)
0.02 0.7183 0.8013(14)
0.03 0.7717 0.8293(13)
0.05 0.8075 0.8429(19)
k RS RS_BPSO
0.01 0.6867 0.8270(19)
0.02 0.7033 0.8434(19)
0.03 0.7633 0.8208(20)
0.05 0.785 0.8325(21)
k RS RS_BPSO
0.01 0.7867 0.8517(24)
0.02 0.8267 0.8762(29)
0.03 0.8067 0.8717(28)
0.05 0.8233 0.8634(22)
k       DF                  dis                 Q statistic         Correlation coefficient $\rho$
        RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO
0.01    0.3668  0.3715      0.4378  0.466       0.1507  0.0127      0.0972  0.0263
0.02    0.4396  0.4437      0.3759  0.4153      0.3794  0.1699      0.1958  0.0864
0.03    0.4677  0.4862      0.3718  0.379       0.3612  0.2837      0.179   0.136
0.05    0.5289  0.5452      0.333   0.3266      0.4448  0.4434      0.2144  0.2099
k       DF                  dis                 Q statistic         Correlation coefficient $\rho$
        RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO
0.01    0.321   0.3174      0.4701  0.4963      0.0667  -0.0321     0.048   -0.0099
0.02    0.3751  0.3834      0.4383  0.4585      0.1594  0.0477      0.0903  0.0351
0.03    0.4094  0.4079      0.409   0.44        0.2615  0.1071      0.1368  0.0589
0.05    0.4543  0.4576      0.3895  0.4115      0.2935  0.1663      0.1448  0.079
k       DF                  dis                 Q statistic         Correlation coefficient $\rho$
        RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO     RS      RS_BPSO
0.01    0.3284  0.3271      0.4722  0.4986      0.0422  -0.0616     0.0399  -0.021
0.02    0.3753  0.3796      0.4559  0.4629      0.0482  0.0233      0.061   0.0265
0.03    0.4114  0.4073      0.428   0.441       0.1462  0.077       0.0875  0.057
0.05    0.4731  0.4764      0.3879  0.3909      0.2504  0.2225      0.1276  0.1146