Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (7): 61-72    DOI: 10.11925/infotech.2096-3467.2017.0516
Original Article
Fine-grained Sentiment Analysis Based on Weibo
Xinhui Dun1, Yunqiu Zhang1, Kaixi Yang2
1School of Public Health, Jilin University, Changchun 130021, China
2International School of Information Science & Engineering, Dalian University of Technology, Dalian 116620, China
Abstract  

[Objective] This paper conducts a fine-grained sentiment analysis of Weibo posts by classifying sentiments into eight categories and calculating their intensity values. [Methods] First, we analyzed the Weibo corpus to construct the question word list. In addition to the seven sentiment categories defined by DUTIR, we added an eighth, “suspected”. Then, we constructed the expression symbol dictionary using the Pointwise Mutual Information method together with the effects of negative words and degree adverbs. We used Python to retrieve the data from Weibo and the jiebaR package to segment the words. Finally, we classified the sentiments and calculated their intensity. [Results] We obtained the proportions of the eight sentiment categories and the sentiment intensity for commonly used diabetes drugs. Precision was highest for “angry” and “sad” (85.73% and 83.05%), while the Recall and F values were highest for “happy” and “like” (above 81%). The Precision, Recall and F values of “suspected” were 77.33%, 78.58% and 77.95%, respectively. [Limitations] The sentiment dictionary needs to be expanded. [Conclusions] The proposed model could analyze the sentiment of Weibo posts more effectively than traditional methods.
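
The dictionary-based steps and the reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the standard PMI definition and the balanced F-measure (harmonic mean of Precision and Recall) are assumed. The function names, the toy English tokens, the seed words, the degree weights, and the sign flip used for negation are all hypothetical.

```python
import math

def pmi(n_xy, n_x, n_y, n_total):
    """Standard PMI: log2(p(x, y) / (p(x) * p(y))), estimated from post counts."""
    if n_xy == 0:
        return float("-inf")  # the two items never co-occur
    return math.log2(n_xy * n_total / (n_x * n_y))

def best_category(symbol, seeds, posts):
    """Pick the category whose seed word has the highest PMI with `symbol`.

    seeds: category -> seed word (hypothetical); posts: list of token sets.
    """
    n_total = len(posts)
    n_sym = sum(symbol in p for p in posts)
    scores = {}
    for category, word in seeds.items():
        n_word = sum(word in p for p in posts)
        n_both = sum(symbol in p and word in p for p in posts)
        scores[category] = pmi(n_both, n_sym, n_word, n_total)
    return max(scores, key=scores.get), scores

# Assumed modifier tables; a real system would load Chinese word lists.
NEGATIONS = {"not"}
DEGREE = {"very": 1.5, "slightly": 0.5}

def intensity(tokens, lexicon):
    """Sum per-category intensity; lexicon maps word -> (category, base value).

    Assumption: degree adverbs scale the next sentiment word and a negation
    flips its sign. The paper's actual rules may differ.
    """
    totals = {}
    weight, negated = 1.0, False
    for tok in tokens:
        if tok in NEGATIONS:
            negated = not negated
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in lexicon:
            category, base = lexicon[tok]
            score = base * weight * (-1.0 if negated else 1.0)
            totals[category] = totals.get(category, 0.0) + score
            weight, negated = 1.0, False  # modifiers apply to one word only
    return totals

def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy data, invented for illustration only.
posts = [{"[smile]", "good"}, {"[smile]", "good"}, {"[cry]", "bad"}, {"bad"}]
seeds = {"happy": "good", "sad": "bad"}
print(best_category("[smile]", seeds, posts)[0])  # happy
print(intensity(["not", "very", "glad"], {"glad": ("happy", 5.0)}))
print(round(f_measure(77.33, 78.58), 2))  # 77.95
```

The last line reproduces the F value reported for “suspected” from its Precision and Recall, which is consistent with the balanced F-measure assumption.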

Key words: Microblog; Fine-grained Sentiment Analysis; Drug
Received: 31 May 2017      Published: 26 July 2017

Cite this article:

Xinhui Dun, Yunqiu Zhang, Kaixi Yang. Fine-grained Sentiment Analysis Based on Weibo. Data Analysis and Knowledge Discovery, 2017, 1(7): 61-72.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0516     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I7/61
