Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (6): 37-47    DOI: 10.11925/infotech.2096-3467.2017.1107
Analyzing Public Opinion from Microblog with Topic Clustering and Sentiment Intensity
Wang Xiufang, Sheng Shu, Lu Yan
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Abstract  

[Objective] This paper builds a model for monitoring trending topics on microblogs, addressing the problems of text drift and the quantification of sentiment polarity. [Methods] We first propose a public opinion analysis model based on topic clustering and sentiment intensity, and then use time series regression analysis to predict sentiment changes in trending topics. [Results] The model's prediction accuracy reached 88.97%, about 7% higher than that of the iLab-Edinburgh model. [Limitations] Early warning mechanisms for emergency events require further research. [Conclusions] The proposed model can improve the accuracy of predicting sentiment changes and offers an effective way to analyze public opinion on microblogs.
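
As a rough illustration of the time series regression step named in the abstract, the sketch below fits a straight-line trend to a topic's sentiment intensity and extrapolates one step ahead. The abstract does not state the regression form, so the linear model is an assumption; the function names and the sample values are hypothetical and are not taken from the paper.

```python
def fit_linear_trend(values):
    """Ordinary least squares fit of y = slope * t + intercept for t = 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept


def forecast(values, steps=1):
    """Extrapolate the fitted trend for the next `steps` time points."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [slope * (n + k) + intercept for k in range(steps)]


# Made-up daily sentiment intensities for one topic (illustrative only).
daily_intensity = [2.0, 2.2, 2.4, 2.6, 2.8]
slope, intercept = fit_linear_trend(daily_intensity)
next_value = forecast(daily_intensity, steps=1)[0]
print(f"slope={slope:.3f}, next={next_value:.3f}")
```

A positive slope would indicate rising sentiment intensity over the observed window; a real system would likely need a richer model than a single straight line.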

Key words: Public Opinion Analysis; Sentiment Analysis; Topic Clustering; Sentiment Intensity Analysis
Received: 07 November 2017      Published: 11 July 2018
CLC Number: G353.1

Cite this article:

Wang Xiufang, Sheng Shu, Lu Yan. Analyzing Public Opinion from Microblog with Topic Clustering and Sentiment Intensity. Data Analysis and Knowledge Discovery, 2018, 2(6): 37-47.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.1107     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I6/37

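#Topic 1 | #Topic 2 | #Topic 3 | #Topic 4 | #Topic 5
magnitude 5.4 (5.4级) | dual camera (双摄) | romance (恋情) | Chinese characteristics (中国特色) | realize (实现)
felt tremors (震感) | photography (拍照) | loving couple (恩爱) | new era (新时代) | Chinese characteristics (中国特色)
Mianyang (绵阳) | experience (体验) | blessings (祝福) | CPC (中共) | Chinese nation (中华民族)
Guangyuan City (广元市) | IOS 11 | mismatched (不般配) | Party organizations (党组织) | dream (梦想)
safety (平安) | pre-sale (预售) | going public (公开) | staying true to the original aspiration (不忘初心) | socialism (社会主义)
Wenchuan earthquake (汶川地震) | splurging (剁手) | Dilraba (迪丽热巴) | Beijing (北京) | striving (奋斗)
natural disaster (自然灾害) | sales volume (销量) | on-screen couple (荧幕CP) | Xi Jinping (习近平) | nineteen (十九)
emergency response (应急) | release (发布) | confession-style posts (表白体) | report (报告) | Two Centenaries (两个一百年)
casualties (伤亡) | Jobs (乔布斯) | fans (粉丝) | CCTV News (央视新闻) | values (价值观)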
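Number of texts | Topic | Polarity label | Topic sentiment words and weights
300 | #Topic 1 | Positive | pray for blessings (祈福): 3.1554; heaven bless (天佑): 2.636; stay strong (加油): 1.845 ……
300 | #Topic 1 | Negative | worried (担心): 3.245; collapse (倒塌): 2.907; fear (惧怕): 0.7824 ……
300 | #Topic 1 | Neutral | self-rescue (自救): 1.445; disaster (灾难): 1.4574 ……
300 | #Topic 2 | Positive | good-looking (好看): 3.785; smooth (流畅): 2.535; easy to operate (操作简单): 1.8553 ……
300 | #Topic 2 | Negative | not cheap (不便宜): 3.522; poor performance (性能差): 2.482; unstable (不稳定): 2.2435; laggy (卡): 1.5345 ……
300 | #Topic 2 | Neutral | simple (简单): 0.933; acceptable (还可以): 1.7734 ……
300 | #Topic 3 | Positive | blessings (祝福): 3.284; accept (接受): 2.3409; like (喜欢): 2.184 ……
300 | #Topic 3 | Negative | mismatched (不般配): 3.484; do not support (不支持): 3.233; dislike (讨厌): 2.323 ……
300 | #Topic 3 | Neutral | break up (分手): 1.366; heartbroken (失恋): 1.384 ……
300 | #Topic 4 | Positive | congratulatory message (贺电): 3.568; proud (自豪): 3.157; go for it (加油): 2.824; looking forward (期待): 2.646 ……
300 | #Topic 4 | Negative | indifferent (不关心): 3.233 ……
300 | #Topic 4 | Neutral | exam (考试): 1.5428; postgraduate entrance exam (考研): 1.738 ……
300 | #Topic 5 | Positive | awesome (厉害): 3.549; hope (希望): 2.892; the best (最棒): 2.547 ……
300 | #Topic 5 | Negative | difficult (困难): 2.783 ……
300 | #Topic 5 | Neutral | study (学习): 1.4857; work hard (努力): 1.626 ……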
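
To make the role of the weights in the table above concrete, the following sketch scores a segmented post by summing the weights of the positive words it contains and subtracting those of the negative words. This aggregation rule is an assumption for illustration, not the formula used in the paper; `polarity_score` and the sample post are hypothetical, and only the #Topic 2 words and weights listed in the table are used.

```python
# Excerpt of the #Topic 2 lexicon from the table above (word -> weight).
POSITIVE = {"好看": 3.785, "流畅": 2.535, "操作简单": 1.8553}
NEGATIVE = {"不便宜": 3.522, "性能差": 2.482, "不稳定": 2.2435, "卡": 1.5345}


def polarity_score(tokens):
    """Sum of positive weights minus sum of negative weights (assumed rule)."""
    pos = sum(POSITIVE.get(tok, 0.0) for tok in tokens)
    neg = sum(NEGATIVE.get(tok, 0.0) for tok in tokens)
    return pos - neg


# Hypothetical post, already split into words by a Chinese word segmenter.
post = ["拍照", "好看", "流畅", "不便宜"]
score = polarity_score(post)
print(f"net polarity: {score:.3f}")
```

Words missing from the lexicon contribute nothing, so the result depends heavily on lexicon coverage and on the quality of the word segmentation.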
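Topic | #Topic 1 | #Topic 2 | #Topic 3 | #Topic 4 | #Topic 5
#Topic 1 | 1 | 0.1865 | 0.1296 | 0.4586 | 0.4132
#Topic 2 | 0.1865 | 1 | 0.2574 | 0.1968 | 0.1269
#Topic 3 | 0.1296 | 0.2574 | 1 | 0.2658 | 0.3326
#Topic 4 | 0.4586 | 0.1968 | 0.2658 | 1 | 0.8434
#Topic 5 | 0.4132 | 0.1269 | 0.3326 | 0.8434 | 1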
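
One way to read the similarity matrix above is to list the topic pairs whose similarity exceeds a cutoff. The sketch below does this with the values from the table; the 0.4 threshold and the helper name `related_pairs` are assumptions chosen for illustration, and the source does not say how the similarities were computed or thresholded.

```python
# Topic-to-topic similarity values copied from the matrix above.
SIMILARITY = [
    [1, 0.1865, 0.1296, 0.4586, 0.4132],
    [0.1865, 1, 0.2574, 0.1968, 0.1269],
    [0.1296, 0.2574, 1, 0.2658, 0.3326],
    [0.4586, 0.1968, 0.2658, 1, 0.8434],
    [0.4132, 0.1269, 0.3326, 0.8434, 1],
]


def related_pairs(matrix, threshold):
    """Return (topic_i, topic_j, similarity) for pairs at or above threshold."""
    pairs = []
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):  # upper triangle only
            if matrix[i][j] >= threshold:
                pairs.append((i + 1, j + 1, matrix[i][j]))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


pairs = related_pairs(SIMILARITY, threshold=0.4)
for a, b, sim in pairs:
    print(f"#Topic {a} ~ #Topic {b}: {sim}")
```

With this cutoff, #Topic 4 and #Topic 5 stand out as by far the closest pair, which matches their overlapping keywords.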
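No. | Topic | Overall discussion index | Monthly peak
1 | #Sichuan Earthquake (#四川地震) | 493 | 12,500
2 | #iPhone 8 | 57,863 | 769,470
3 | #Lu Han and Guan Xiaotong (#鹿晗关晓彤) | 58,429 | 785,282
4 | #19th National Congress (#十九大) | 288,031 | 3,776,591
5 | #Chinese Dream (#中国梦) | 38,630 | 436,625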
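Topic | #Sichuan Earthquake (#四川地震) | #iPhone 8 | #Lu Han and Guan Xiaotong (#鹿晗关晓彤) | #19th National Congress (#十九大) | #Chinese Dream (#中国梦)
Sentiment weight | 1.173 | 1.256 | 1.582 | 1.486 | 1.248
Sentiment intensity | 2.1987 | 2.3874 | 2.7834 | 2.5332 | 2.4021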
[1] Ma Xiaoling, Jin Biyi, Fan Bingsi. An Analysis of Chinese Text Emotional Tendency[J]. Information and Documentation Service, 2013(1): 52-56. (in Chinese)
[2] Vaibhavi N, Patodkar N P, Shaikh I R. Sentimental Analysis on Twitter Data Using Naive Bayes[C]// Proceedings of the 6th Post Graduate Conference for Computer Engineering. 2017.
[3] Tang Xiaobo, Luo Yingli. Integrating Emotional Divergence and User Interests into the Prediction of Microblog Retweeting[J]. Library and Information Service, 2017, 61(9): 102-110. (in Chinese)
[4] Ingle M M, Emmanues M. Evaluations on Sentiment Analysis of Micro Blogging Site Using Topic Modeling[C]// Proceedings of the 2016 International Conference on Signal Processing, Communication, Power and Embedded System. 2016.
[5] Giatsoglou M, Vozalis M G, Diamantaras K, et al. Sentiment Analysis Leveraging Emotions and Word Embeddings[J]. Expert Systems with Applications, 2017, 69: 214-224.
doi: 10.1016/j.eswa.2016.10.043
[6] Han Zhongming, Chen Ni, Le Jiajin, et al. An Efficient and Effective Clustering Algorithm for Time Series of Hot Topics[J]. Chinese Journal of Computers, 2012, 35(11): 2337-2347. (in Chinese)
doi: 10.3724/SP.J.1016.2012.02337
[7] Wu Qinglin, Zhou Tianhong. Public Opinion Analysis of Chinese Microblog Based on Topic Clustering and Emotion Intensity[J]. Information Studies: Theory & Application, 2016, 39(1): 109-112. (in Chinese)
doi: 10.16353/j.cnki.1000-7490.2016.01.019
[8] He Yue, Xiao Min, Zhang Yue. Sentiment Analysis of Trending Topics Based on Relevance[J]. Data Analysis and Knowledge Discovery, 2017, 1(3): 46-53. (in Chinese)
[9] Sotiropoulos D N, Kounavis C D, Kourouthanassis P, et al. What Drives Social Sentiment? An Entropic Measure-based Clustering Approach Towards Identifying Factors that Influence Social Sentiment Polarity[C]// Proceedings of the 5th International Conference on Information, Intelligence, Systems and Applications. 2014.
[10] Manek A S, Shenoy P D, Mohan M C, et al. Aspect Term Extraction for Sentiment Analysis in Large Movie Reviews Using Gini Index Feature Selection Method and SVM Classifier[J]. World Wide Web, 2017, 20(2): 135-154.
doi: 10.1007/s11280-015-0381-x
[11] Li Hui, Chai Yaqing. Analysis Sentiment Polarity of Comments Based on Attributes[J]. Data Analysis and Knowledge Discovery, 2017, 1(10): 1-11. (in Chinese)
[12] Meisheri H, Saha R, Sinha P, et al. Textmining at EmoInt-2017: A Deep Learning Approach to Sentiment Intensity Scoring of English Tweets[C]// Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Copenhagen, Denmark. 2017.
[13] Zheng Lijuan, Wang Hongwei, Guo Kaiqiang. Sentiment Intensity of Online Reviews Based on Fuzzy-Statistics of Sentiment Words[J]. Journal of Systems & Management, 2014, 23(3): 324-330. (in Chinese)
[14] Pérez-Ortiz M, Gutiérrez P A, Carbonero-Ruz M, et al. Semi-supervised Learning for Ordinal Kernel Discriminant Analysis[J]. Neural Networks, 2016, 84: 57-66.
doi: 10.1016/j.neunet.2016.08.004 pmid: 27639724
[15] Zhou Hangxing, Chen Songcan. Ordinal Discriminative Canonical Correlation Analysis[J]. Journal of Software, 2014, 25(9): 2018-2025. (in Chinese)
doi: 10.13328/j.cnki.jos.004649
[16] Hotelling H. Relations Between Two Sets of Variates[J]. Biometrika, 1936, 28(3-4): 321-377.
doi: 10.2307/2333955
[17] Yoshida K, Yoshimoto J, Doya K. Sparse Kernel Canonical Correlation Analysis for Discovery of Nonlinear Interactions in High-dimensional Data[J]. BMC Bioinformatics, 2017, 18(1): 108-118.
doi: 10.1186/s12859-017-1543-x pmid: 28196464
[18] Zhong Minjuan, Wan Changxuan, Liu Dexi. Opinion Lexicon Construction Based on Association Rule and Orientation Analysis for Production Review[J]. Journal of the China Society for Scientific and Technical Information, 2016, 35(5): 501-509. (in Chinese)
[19] Liu Dexi. Effect of Sentimental Word Expansion on the Performance of Microblog Sentiment Classification Task[J]. Journal of Chinese Computer Systems, 2016, 37(5): 957-965. (in Chinese)
[20] Yang Lin. Emotional Term Weight Research and Application to Emotional Polarity Analysis[J]. Journal of Computer Applications, 2015, 35(S2): 125-127. (in Chinese)
[21] van Goethem A, Staals F, Löffler M, et al. Multi-Granular Trend Detection for Time-Series Analysis[J]. IEEE Transactions on Visualization and Computer Graphics, 2017, 23(1): 661-670.
doi: 10.1109/TVCG.2016.2598619
[22] Tang Xiaobo, Tong Haiyan, Yan Chengxi. Microblogging Public Opinion Analysis Based on Emotional Intensity of the Topic[J]. Research on Library Science, 2014(17): 85-93. (in Chinese)
[23] Refaee E, Rieser V. iLab-Edinburgh at SemEval-2016 Task 7: A Hybrid Approach for Determining Sentiment Intensity of Arabic Twitter Phrases[C]// Proceedings of the 10th International Workshop on Semantic Evaluation. 2016.