Sentiment Analysis for Micro-blogs with LDA and AdaBoost
Zeng Ziming, Yang Qianwen
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China

Abstract [Objective] This paper aims to improve the performance of sentiment analysis for micro-blog texts using the LDA model and the AdaBoost algorithm. [Methods] First, we used the LDA topic model to extract the topics of micro-blog posts. Then, we merged the emotional and sentence-pattern features. Finally, we trained the proposed sentiment analysis model with the AdaBoost ensemble classification method. [Results] The topic feature had a significant positive impact on emotion recognition; the model combining topic and emotional features therefore yielded the best results. The precision of the proposed model reached 84.512%, and the recall reached 83.160%. [Limitations] The sample size needs to be expanded, and the sentiment dictionary should be improved. We did not study the emoticons in the micro-blog posts. [Conclusions] The proposed AdaBoost model with LDA can effectively identify emotional tendencies.
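The final step of the method, boosting a classifier over merged feature vectors, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the weak learner (a one-feature threshold stump), the feature layout (two topic proportions followed by three emotional or sentence-pattern indicators), the toy data, and all function names. The topic proportions are assumed to come from an LDA model fitted beforehand.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    """Classify one row as +1 or -1 with a single threshold stump."""
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def adaboost_fit(X, y, rounds=10):
    """Discrete AdaBoost: reweight samples so later stumps focus on errors."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-10)  # clamp to avoid division by zero
        if err >= 0.5:              # weak learner no better than chance
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase weights of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps; +1 = positive, -1 = negative sentiment."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical merged features per post (invented toy data):
# [topic_0_prob, topic_1_prob, positive_words, negative_words, has_exclamation]
X = [
    [0.8, 0.2, 3, 0, 1],
    [0.7, 0.3, 2, 1, 0],
    [0.9, 0.1, 1, 0, 0],
    [0.2, 0.8, 0, 2, 1],
    [0.3, 0.7, 1, 3, 0],
    [0.1, 0.9, 0, 1, 0],
]
y = [1, 1, 1, -1, -1, -1]

model = adaboost_fit(X, y)
train_predictions = [adaboost_predict(model, row) for row in X]
```

On real data, the topic proportions would be replaced by the per-post topic distribution inferred by LDA, and precision and recall would be measured on held-out posts rather than on the training set.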
Received: 17 January 2018
Published: 08 September 2018