Micro-Blog Fine-Grained Sentiment Analysis Based on Multi-Feature Fusion
Wu Xuxu1, Chen Peng1, Jiang Huan2
1School of Information and Cyber Security, People’s Public Security University of China, Beijing 100045, China
2School of E-Business and Logistics, Beijing Technology and Business University, Beijing 100048, China
[Objective] This paper proposes the RB-LCM model to improve fine-grained sentiment analysis of Weibo texts. [Methods] First, we used RoBERTa to encode the character-level and sentence-level features of Weibo posts. Second, we used a Bi-LSTM and a capsule network to capture deep global and local features of the sentences. Third, we applied multi-head self-attention to fuse the resulting multi-dimensional features. Finally, we trained the model with an improved Focal Loss and the Fast Gradient Method (FGM) to mitigate label imbalance in the datasets and improve the model’s robustness. [Results] The proposed model achieved accuracy and F1 values of 80.64% and 77.41% on the SMP2020-EWECT dataset, 67.17% and 51.08% on the NLPCC2013 Task 2 dataset, 71.27% and 58.25% on the NLPCC2014 Task 1 dataset, and 98.45% and 98.44% on the binary sentiment dataset weibo_senti_100k. On every dataset, these results were better than those of the advanced sentiment analysis models used for comparison. [Limitations] The model does not use images, videos, audio, or other non-text information for sentiment analysis. [Conclusions] The proposed model can effectively analyze the sentiment of Weibo posts.
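The two training components named in the Methods can be illustrated with a minimal sketch. It assumes the standard multi-class focal loss, since the abstract does not describe the "improved" variant, and the usual FGM perturbation, which scales the gradient to a fixed L2 norm. The function names and the default values of `gamma`, `alpha`, and `epsilon` are illustrative assumptions, not settings from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard focal loss for one example (assumed form, not the paper's variant).

    FL = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted
    probability of the true class. `alpha` is an optional list of per-class
    weights (hypothetical values); gamma = 0 recovers plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def fgm_perturbation(grad, epsilon=1.0):
    """FGM perturbation: r_adv = epsilon * g / ||g||_2.

    `grad` is the gradient of the loss with respect to an embedding vector,
    given here as a plain list of floats. A zero gradient yields no perturbation.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]


if __name__ == "__main__":
    easy = [4.0, 0.0, 0.0]   # confident, correct prediction for class 0
    hard = [0.0, 1.0, 0.0]   # true class 0 receives low probability
    print(focal_loss(easy, 0), focal_loss(hard, 0))
    print(fgm_perturbation([3.0, 4.0], epsilon=1.0))
```

In this form, the factor `(1 - p_t) ** gamma` shrinks the loss of well-classified examples, so frequent, easy labels contribute less to training than rare or hard ones. In adversarial training, the perturbation would be added to the embeddings, a second loss computed on the perturbed input, and the embeddings restored before the optimizer step.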