Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (11): 68-79    DOI: 10.11925/infotech.2096-3467.2021.0339
Sentiment Analysis of Weibo Posts on Public Health Emergency with Feature Fusion and Multi-Channel
Han Pu1,2, Zhang Wei1, Zhang Zhanpeng1, Wang Yuxin1, Fang Haoyu1
1School of Management, Nanjing University of Posts & Telecommunications, Nanjing 210003, China
2Jiangsu Provincial Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
Abstract  

[Objective] This paper proposes MCMF-A, a multi-channel model for Weibo posts that combines feature fusion with an attention mechanism, aiming to capture more of the semantic information in posts about public health emergencies. [Methods] First, at the embedding layer, we generated word vectors with Word2vec and FastText and merged them with part-of-speech and position feature vectors. Second, we built a multi-channel layer based on CNN and BiLSTM to extract local and global features of the posts. Third, we applied an attention mechanism to extract the important features of the texts. Finally, we merged the outputs of the channels and used the softmax function for sentiment classification. [Results] We evaluated the MCMF-A model on 42,384 Weibo posts about COVID-19. Its F1 score reached 90.21%, which is 9.71 and 9.14 percentage points higher than the benchmark CNN and BiLSTM models, respectively. [Limitations] More research is needed to enlarge the experimental data set and to include multi-modal information such as images and voice. [Conclusions] The proposed model can effectively perform sentiment analysis of Weibo posts.
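
The attention and softmax steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It assumes dot-product scoring against a hypothetical context vector `query` and a single linear output layer; the abstract does not state which scoring function or output layer the authors used, so the function names, weights, and toy values below are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h . query) and return the weighted sum.

    Assumption: dot-product scoring; the paper may use a different form.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


def classify(features, weight_rows, biases):
    """Linear layer followed by softmax over two classes (0 = negative, 1 = positive)."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weight_rows, biases)]
    return softmax(logits)


# Toy values, purely illustrative (not taken from the paper).
hidden_states = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.4]]
query = [1.0, 0.5]
pooled, weights = attention_pool(hidden_states, query)
probs = classify(pooled, weight_rows=[[-1.0, 0.5], [1.0, -0.5]], biases=[0.0, 0.0])
print(weights, probs)
```

In this sketch the attention weights sum to one, so the pooled vector is a convex combination of the per-step features, and the classifier returns a probability for each of the two sentiment classes.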

Keywords: Multi-Channel, Feature Fusion, Deep Learning, Sentiment Analysis, Public Health Emergencies
Received: 07 April 2021      Published: 23 December 2021
CLC Number: G350
Fund: National Social Science Fund of China (17CTQ022); National Innovation Training Program for College Students (SZDG2020040); Jiangsu Graduate Research and Innovation Program Fund Project (KYCX20_0844)
Corresponding Author: Han Pu, ORCID: 0000-0001-5867-4292, E-mail: hanpu@njupt.edu.cn

Cite this article:

Han Pu, Zhang Wei, Zhang Zhanpeng, Wang Yuxin, Fang Haoyu. Sentiment Analysis of Weibo Posts on Public Health Emergency with Feature Fusion and Multi-Channel. Data Analysis and Knowledge Discovery, 2021, 5(11): 68-79.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0339     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I11/68

Experimental Flowchart
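Sentiment word type | Description | POS tag
Positive Comment Words | Words expressing a positive evaluation | PC
Positive Sentiment Words | Words expressing a positive emotion | PS
Negative Comment Words | Words expressing a negative evaluation | NC
Negative Sentiment Words | Words expressing a negative emotion | NS
Degree Words | Degree words | ADV
Negative Words | Negation words | INVER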
POS Tagging Rules
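
As a rough illustration of how the tags in this table might be attached to segmented text, the following sketch looks each token up in a lexicon and falls back to a default tag. The page lists the tags but not the tagging rules themselves, so the lexicon entries, the default tag `"O"`, and the function name `tag_tokens` are all hypothetical.

```python
# Tags come from the table above; everything else here is an assumption.
SENTIMENT_TAGS = {"PC", "PS", "NC", "NS", "ADV", "INVER"}

# Hypothetical lexicon entries, for illustration only.
HYPOTHETICAL_LEXICON = {
    "成功": "PC",   # "successful": positive comment word
    "高兴": "PS",   # "happy": positive sentiment word
    "可笑": "NC",   # "ridiculous": negative comment word
    "害怕": "NS",   # "afraid": negative sentiment word
    "很": "ADV",    # "very": degree word
    "不": "INVER",  # "not": negation word
}


def tag_tokens(tokens, lexicon, default_tag="O"):
    """Return (token, tag) pairs; tokens missing from the lexicon get default_tag."""
    return [(tok, lexicon.get(tok, default_tag)) for tok in tokens]


# A pre-segmented toy sentence: "fighting / the epidemic / very / successful".
tokens = ["抗击", "疫情", "很", "成功"]
tagged = tag_tokens(tokens, HYPOTHETICAL_LEXICON)
print(tagged)
```

Each tag could then be mapped to its own embedding vector and merged with the word vector, which is the role the part-of-speech features play in the abstract.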
Multi-Feature Vector Fusion Process
Neural Network Structure Based on WMF
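Parameter | Description | Value
Max Length of Sentences | Maximum text sequence length | 200
Size of Word Vector | Dimension of the word vectors | 100
Size of Sentiment Feature Vector | Dimension of the part-of-speech feature vectors | 30
Size of Position Feature Vector | Dimension of the position feature vectors | 20
Batch Size | Number of samples per batch | 256
Window Size | Convolution kernel window sizes | [3, 4, 5]
Number of Feature Maps | Number of convolution kernels | 100
Hidden Size of BiLSTM | Size of the BiLSTM hidden layer | 256
Epochs | Number of training epochs | 10
Learning Rate | Learning rate | 0.01
Dropout | Fraction of input units randomly dropped | 0.50
Optimizer | Optimizer | Adam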
Model Parameter Setting
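
The listed parameter values imply some tensor sizes that are easy to work out by hand. The short calculation below is a sketch under stated assumptions: feature fusion by concatenation, unpadded stride-1 convolutions with max-over-time pooling, and concatenated forward and backward BiLSTM states. None of these choices is confirmed by this page, and the variable names are illustrative.

```python
# Values taken from the parameter table.
MAX_LEN = 200             # Max Length of Sentences
WORD_DIM = 100            # Size of Word Vector
POS_DIM = 30              # listed as "Size of Sentiment Feature Vector"
POSITION_DIM = 20         # Size of Position Feature Vector
WINDOW_SIZES = [3, 4, 5]  # Window Size
NUM_FEATURE_MAPS = 100    # Number of Feature Map
BILSTM_HIDDEN = 256       # Hidden Size of BiLSTM

# Assumption: the three feature vectors are merged by concatenation.
fused_dim = WORD_DIM + POS_DIM + POSITION_DIM

# Assumption: convolution without padding and with stride 1,
# so a window of size k yields MAX_LEN - k + 1 positions.
conv_lengths = {k: MAX_LEN - k + 1 for k in WINDOW_SIZES}

# Assumption: max-over-time pooling per feature map, then concatenation
# across the window sizes.
cnn_feature_dim = NUM_FEATURE_MAPS * len(WINDOW_SIZES)

# Assumption: forward and backward hidden states are concatenated.
bilstm_output_dim = 2 * BILSTM_HIDDEN

print(fused_dim, conv_lengths, cnn_feature_dim, bilstm_output_dim)
```

Under these assumptions each token is represented by a 150-dimensional fused vector, the CNN channel contributes 300 features per post, and the BiLSTM channel produces 512 features per time step.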
Dataset | Class | Count
Weibo corpus | Positive | 21,192
Weibo corpus | Negative | 21,192
Weibo corpus | Total | 42,384
Annotation Statistics of Micro-Blog Data
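Sentiment polarity | Weibo text
Positive (1) | ["They have been very successful in fighting the epidemic" #New York Times reporter praises China's fight against the epidemic#] "China has been very successful in fighting the epidemic!" "Don't think the dancing in the Fangcang shelter hospitals is ridiculous; it is a clever way of healing." "Isolation is China's key to cutting off the spread of the epidemic." Donald McNeil, a senior health and science reporter at The New York Times, recently praised China's "fight against the epidemic" in public, corrected the distortions and misreadings of Western media, and said it contrasted sharply with the responses of Italy and the United States. The video has been viewed and shared millions of times worldwide.
Negative (0) | [#The pathogen of the Wuhan pneumonia outbreak is not the SARS virus#] In recent days, the claim that the Wuhan pneumonia outbreak was caused by a new type of SARS virus has been circulating online. On January 18, the official account of the Chinese Center for Disease Control and Prevention published a popular science article, "About the Wuhan Viral Pneumonia: Don't Believe These 5 Rumors!", which stated that the pathogen causing the Wuhan viral pneumonia outbreak is not the SARS virus. Current investigations show that the human-to-human transmissibility and pathogenicity of the virus are both weaker than those of SARS. #Wuhan reports 4 new cases of novel coronavirus pneumonia#.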
Sample of Micro-Blog Data
Model P/% R/% F1/%
SVM 75.71 70.09 72.79
CNN 80.98 80.02 80.50
RNN 78.01 78.19 78.10
LSTM 79.13 79.33 79.23
BiLSTM 81.32 80.83 81.07
Results of Five Benchmark Model Experiments
Model P/% R/% F1/%
CNN-G 82.04 81.41 81.72
CNN-W2V 82.16 81.73 81.94
CNN-WMF 83.77 82.90 83.33
CNN-FT 82.54 81.92 82.23
CNN-FMF 84.13 83.74 83.93
BiLSTM-G 83.26 83.00 83.13
BiLSTM-W2V 83.34 83.19 83.26
BiLSTM-WMF 84.49 84.57 84.53
BiLSTM-FT 83.67 83.79 83.73
BiLSTM-FMF 85.32 85.09 85.20
Results of Multi-Feature Fusion Experiments
Model P/% R/% F1/%
CNN-WMF-A 85.02 84.87 84.94
CNN-FMF-A 85.34 85.03 85.18
BiLSTM-WMF-A 85.97 85.73 85.85
BiLSTM-FMF-A 86.42 86.14 86.28
Results of Attention Mechanism Experiments
Model P/% R/% F1/%
CNN-BiLSTM-WMF-A 87.63 87.56 87.59
CNN-BiLSTM-FMF-A 88.07 88.43 88.25
MCMF-A 90.45 89.98 90.21
Results of Multi-Channel Experiments
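
The reported F1 values can be cross-checked against precision and recall with the standard formula F1 = 2PR / (P + R). The sketch below does this for a few rows as read from the result tables on this page, and recomputes the gaps quoted in the abstract; the dictionary layout and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) as read from the result tables on this page.
reported = {
    "SVM": (75.71, 70.09, 72.79),
    "CNN": (80.98, 80.02, 80.50),
    "BiLSTM": (81.32, 80.83, 81.07),
    "MCMF-A": (90.45, 89.98, 90.21),
}

for model, (p, r, f1) in reported.items():
    # Print the recomputed F1 next to the reported one for comparison.
    print(model, round(f1_score(p, r), 2), f1)

# Differences in F1, expressed in percentage points.
gain_over_cnn = round(reported["MCMF-A"][2] - reported["CNN"][2], 2)
gain_over_bilstm = round(reported["MCMF-A"][2] - reported["BiLSTM"][2], 2)
print(gain_over_cnn, gain_over_bilstm)
```

The recomputed values agree with the reported F1 scores to within rounding, and the two gaps match the 9.71 and 9.14 figures given in the abstract, which are therefore differences in percentage points.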