Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (2): 73-79    DOI: 10.11925/infotech.2096-3467.2017.02.10
Original Article
Analyzing Sentiments of Micro-blog Posts Based on Support Vector Machine
Yang Shuang, Chen Fen
School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094, China
Abstract  

[Objective] This paper proposes a Support Vector Machine (SVM) based method for analyzing the sentiment of micro-blog posts, in support of monitoring online public opinion. [Methods] We extracted fourteen linguistic features from micro-blog posts and classified their sentiment with an SVM. [Results] The precision, recall, and F value of the proposed method were 82.40%, 81.91%, and 82.10%, respectively. [Limitations] The training corpus needs to be expanded. [Conclusions] The proposed method can effectively analyze the sentiment of micro-blog posts.

Keywords: Microblog; Sentiment Analysis; Support Vector Machine; Parsing
Received: 29 August 2016      Published: 27 March 2017
CLC Number: G35; TP391

Cite this article:

Yang Shuang, Chen Fen. Analyzing Sentiments of Micro-blog Posts Based on Support Vector Machine. Data Analysis and Knowledge Discovery, 2017, 1(2): 73-79.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.02.10     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I2/73

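Weight | Examples | Count
2.0 | 百分之百 (one hundred percent), 绝对 (absolutely), 非常 (extremely), 超 (super), 过于 (excessively), ... | 99
1.5 | 很 (very), 多么 (how, so), 更加 (even more), 不胜 (exceedingly), ... | 78
1.0 | 比较 (relatively), 较为 (fairly), 多多少少 (more or less), ... | 13
0.5 | 稍微 (slightly), 略为 (somewhat), 不怎么 (not very), 不为过 (not excessive), ... | 54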
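The weight table above feeds feature F6 (the highest degree-adverb weight in a post). A minimal sketch of that lookup follows, assuming the post is already segmented into tokens. The dictionary holds only the example words shown on this page, not the full lexicon, and returning 0.0 when no degree adverb is present is an assumption.

```python
# Degree-adverb weights, limited to the example words listed in the table.
# The full lexicon (99, 78, 13 and 54 entries per tier) is not reproduced here.
DEGREE_WEIGHTS = {
    "百分之百": 2.0, "绝对": 2.0, "非常": 2.0, "超": 2.0, "过于": 2.0,
    "很": 1.5, "多么": 1.5, "更加": 1.5, "不胜": 1.5,
    "比较": 1.0, "较为": 1.0, "多多少少": 1.0,
    "稍微": 0.5, "略为": 0.5, "不怎么": 0.5, "不为过": 0.5,
}


def max_degree_weight(tokens):
    """Return the highest degree-adverb weight among the tokens.

    Falls back to 0.0 when the post contains no known degree adverb
    (an assumption; the page does not state the default).
    """
    weights = (DEGREE_WEIGHTS[t] for t in tokens if t in DEGREE_WEIGHTS)
    return max(weights, default=0.0)


# Hypothetical segmented post: "today fairly tired, but extremely happy".
tokens = ["今天", "比较", "累", "但是", "非常", "开心"]
print(max_degree_weight(tokens))  # 2.0
```

Taking the maximum rather than a sum follows the F6 description ("highest weight"); how the weights enter the sentiment score F7 is not given on this page.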
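Feature type | Meaning
Part-of-speech features | Number of verbs in the post (F1)
Part-of-speech features | Number of adjectives in the post (F2)
Part-of-speech features | Number of adverbs in the post (F3)
Sentiment features | Number of positive sentiment words in the post (F4)
Sentiment features | Number of negative sentiment words in the post (F5)
Sentiment features | Highest degree-adverb weight in the post (F6)
Sentiment features | Sentiment score of the post (F7)
Sentence-pattern features | Number of negation words (F8)
Sentence-pattern features | Number of exclamation marks (F9)
Sentence-pattern features | Number of question marks (F10)
Semantic features | Adverbial modifiers related to sentiment words (F11)
Semantic features | Adjectival modifiers related to sentiment words (F12)
Semantic features | Nominal subjects related to sentiment words (F13)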
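The count-based features in the list above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the input is a list of (word, tag) pairs from some segmenter, the coarse tags "v", "a" and "d" stand for verb, adjective and adverb, and the three small lexicons are hypothetical. F6, F7 and F11 to F13 are left out because they need the degree-adverb lexicon, a scoring formula and a dependency parse that this page does not specify.

```python
# Hypothetical mini-lexicons, for illustration only.
POSITIVE_WORDS = {"开心", "喜欢"}
NEGATIVE_WORDS = {"讨厌", "失望"}
NEGATION_WORDS = {"不", "没", "没有"}


def count_features(tagged_post):
    """Compute the count-based features for one post.

    tagged_post is a list of (word, pos_tag) pairs; the tag names
    "v", "a", "d" are assumed, not taken from the paper.
    """
    words = [word for word, _ in tagged_post]
    tags = [tag for _, tag in tagged_post]
    return {
        "F1": tags.count("v"),                          # verbs
        "F2": tags.count("a"),                          # adjectives
        "F3": tags.count("d"),                          # adverbs
        "F4": sum(w in POSITIVE_WORDS for w in words),  # positive words
        "F5": sum(w in NEGATIVE_WORDS for w in words),  # negative words
        "F8": sum(w in NEGATION_WORDS for w in words),  # negation words
        "F9": sum(w in ("!", "！") for w in words),      # exclamation marks
        "F10": sum(w in ("?", "？") for w in words),     # question marks
    }


# Hypothetical tagged post.
post = [("我", "r"), ("不", "d"), ("喜欢", "v"), ("这", "r"), ("部", "q"),
        ("电影", "n"), ("！", "w"), ("太", "d"), ("失望", "a"), ("了", "y"),
        ("！", "w")]
print(count_features(post))
```

Both ASCII and full-width punctuation are counted, since Chinese micro-blog text commonly mixes the two.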
Class | Count
Very positive | 217
Positive | 1,149
Neutral | 2,081
Negative | 1,239
Very negative | 304
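The class counts above are uneven, which matters when reading the reported scores. The short sketch below computes each class's share of the corpus and the accuracy a trivial majority-class predictor would reach on the full set. The English labels are translations, and the page does not say how the data were split into training and test sets.

```python
# Class counts as listed in the table above (labels translated).
CLASS_COUNTS = {
    "very positive": 217,
    "positive": 1149,
    "neutral": 2081,
    "negative": 1239,
    "very negative": 304,
}

total = sum(CLASS_COUNTS.values())

# Percentage share of each class, rounded to two decimals.
shares = {label: round(100 * n / total, 2) for label, n in CLASS_COUNTS.items()}

# Always predicting the largest class gives this accuracy on the whole corpus.
majority_label = max(CLASS_COUNTS, key=CLASS_COUNTS.get)

print(total)                                   # 4990
print(majority_label, shares[majority_label])  # neutral 41.7
```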
Sentiment value | Feature vector (index:value pairs for F1 to F13)
+2 | 1:2 2:0 3:2 4:2 5:0 6:2.0 7:4.0 8:0 9:3 10:0 11:1 12:0 13:1
+1 | 1:4 2:2 3:3 4:3 5:0 6:0.0 7:1.0 8:0 9:1 10:0 11:0 12:2 13:2
-2 | 1:2 2:2 3:0 4:0 5:2 6:2.0 7:-4.0 8:1 9:0 10:1 11:2 12:0 13:0
-1 | 1:3 2:5 3:3 4:1 5:4 6:1.0 7:-2.0 8:3 9:0 10:6 11:1 12:3 13:0
0 | 1:3 2:2 3:3 4:1 5:0 6:0 7:1.0 8:2 9:1 10:1 11:2 12:3 13:0
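The sample rows above follow the LibSVM text layout: a class label followed by index:value pairs with 1-based indices. A minimal sketch of writing and reading one such line is shown below. The function names are hypothetical, and writing every feature (including zeros) mirrors the rows above, although LibSVM's sparse format also allows zero-valued features to be omitted.

```python
def to_libsvm_line(label, values):
    """Format a label and a dense feature list as one LibSVM-style line."""
    # Non-zero labels get an explicit sign, matching the "+2" / "-2" rows.
    label_text = f"{label:+d}" if label else "0"
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(values, start=1))
    return f"{label_text} {pairs}"


def parse_libsvm_line(line):
    """Parse one LibSVM-style line into (label, {index: value})."""
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return int(label), features


# First sample row: sentiment value +2.
row = [2, 0, 2, 2, 0, 2.0, 4.0, 0, 3, 0, 1, 0, 1]
line = to_libsvm_line(2, row)
print(line)  # +2 1:2 2:0 3:2 4:2 5:0 6:2.0 7:4.0 8:0 9:3 10:0 11:1 12:0 13:1
print(parse_libsvm_line(line)[0])  # 2
```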
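Experiment | Feature combination | Precision
1 | Part of speech | 57.60%
2 | Part of speech + sentiment words | 80.93%
3 | Part of speech + sentiment words + degree-adverb weight | 81.76%
4 | Part of speech + sentiment words + degree-adverb weight + sentiment score | 81.95%
5 | Part of speech + sentiment words + degree-adverb weight + sentiment score + negation words | 82.14%
6 | Part of speech + sentiment words + degree-adverb weight + sentiment score + negation words + question and exclamation marks | 82.22%
7 | Part of speech + sentiment words + degree-adverb weight + sentiment score + negation words + question and exclamation marks + semantic features | 82.40%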
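Because each experiment above adds one feature group to the previous one, the contribution of each group can be read off as the difference between consecutive scores. The sketch below does that arithmetic on the seven values from the last column of the table; the English group names are translations.

```python
# (feature group added, score in percent) for experiments 1 to 7.
RESULTS = [
    ("part of speech", 57.60),
    ("sentiment words", 80.93),
    ("degree-adverb weight", 81.76),
    ("sentiment score", 81.95),
    ("negation words", 82.14),
    ("question and exclamation marks", 82.22),
    ("semantic features", 82.40),
]


def incremental_gains(results):
    """Return (group, gain in percentage points) for each added group."""
    gains = []
    for (_, previous), (name, current) in zip(results, results[1:]):
        gains.append((name, round(current - previous, 2)))
    return gains


for name, gain in incremental_gains(RESULTS):
    print(f"{name}: +{gain:.2f}")
```

The sentiment-word counts account for by far the largest step (23.33 points); every later group adds less than one point.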
Method | Precision | Recall | F1 value
Proposed method | 82.40% | 81.91% | 82.10%
Cascaded CRFs method | 75.31% | 73.30% | 74.30%
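As a quick check on the comparison above, the F1 value is conventionally the harmonic mean of precision and recall. The sketch below applies that standard definition to the two rows. The page does not state how the authors aggregated scores across the five classes, so a small gap from the reported figures does not by itself indicate an error.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), in percent, from the table above.
REPORTED = {
    "Proposed method": (82.40, 81.91, 82.10),
    "Cascaded CRFs method": (75.31, 73.30, 74.30),
}

for name, (p, r, f_reported) in REPORTED.items():
    print(name, round(f1_score(p, r), 2), f_reported)
```

The harmonic means come to about 82.15 and 74.29, close to the reported 82.10 and 74.30. Differences of this size can arise when precision, recall and F1 are each averaged per class before reporting.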