Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (12): 102-113    DOI: 10.11925/infotech.2096-3467.2022.1028
Micro-Blog Fine-Grained Sentiment Analysis Based on Multi-Feature Fusion
Wu Xuxu1,Chen Peng1(),Jiang Huan2
1School of Information and Cyber Security, People’s Public Security University of China, Beijing 100045, China
2School of E-Business and Logistics, Beijing Technology and Business University, Beijing 100048, China
Abstract  

[Objective] This paper proposes the RB-LCM model to improve fine-grained sentiment analysis of Weibo texts. [Methods] First, we used RoBERTa to encode character- and sentence-level features of Weibo posts. Then, we used a Bi-LSTM and a capsule network to capture deep global and local features of the sentences. Third, we fused the resulting multi-dimensional features with multi-head self-attention. Finally, we trained the model with an improved Focal Loss and FGM adversarial training to mitigate label imbalance in the datasets and improve the model's robustness. [Results] The proposed model reached 80.64% accuracy and a 77.41% F1 score on the SMP2020-EWECT dataset, 67.17% and 51.08% on the NLPCC2013 task 2 dataset, 71.27% and 58.25% on the NLPCC2014 task 1 dataset, and 98.45% and 98.44% on the binary weibo_senti_100k dataset. All results exceeded those of advanced sentiment analysis models on each dataset. [Limitations] Our model does not use accompanying pictures, videos, audio, or other modalities for sentiment analysis. [Conclusions] The proposed model can effectively analyze the sentiment of Weibo posts.
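The full RB-LCM architecture is specified in the paper itself; below is only a minimal PyTorch sketch of the pipeline the abstract describes (RoBERTa features → Bi-LSTM global branch + capsule local branch → multi-head self-attention fusion → classifier). All module names, the random stand-in for RoBERTa output, and the way the two branches are pooled into "feature tokens" are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def squash(s, dim=-1):
    # Capsule non-linearity: scales each vector's length into (0, 1).
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1 + n2)) * s / (n2.sqrt() + 1e-8)

class RBLCMSketch(nn.Module):
    """Toy RB-LCM head: encoder output -> Bi-LSTM (global features) +
    primary capsules (local features) -> multi-head self-attention fusion
    -> emotion classifier. Hyperparameters follow the paper's tables."""
    def __init__(self, d_model=768, hidden=384, n_caps=24, d_caps=32,
                 n_heads=3, n_classes=6):
        super().__init__()
        self.bilstm = nn.LSTM(d_model, hidden, num_layers=1,
                              batch_first=True, bidirectional=True)
        self.primary = nn.Linear(d_model, n_caps * d_caps)   # primary capsules
        self.cap_proj = nn.Linear(n_caps * d_caps, d_model)
        self.n_caps, self.d_caps = n_caps, d_caps
        self.fusion = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fc = nn.Linear(d_model, n_classes)

    def forward(self, x):                       # x: (batch, seq, d_model)
        g, _ = self.bilstm(x)                   # global context, dim 2*hidden = d_model
        caps = self.primary(x).view(x.size(0), x.size(1),
                                    self.n_caps, self.d_caps)
        l = self.cap_proj(squash(caps).flatten(2))          # local capsule features
        feats = torch.stack([g.mean(1), l.mean(1)], dim=1)  # two feature "tokens"
        fused, _ = self.fusion(feats, feats, feats)         # self-attention fusion
        return self.fc(fused.mean(1))           # logits over emotion classes

x = torch.randn(2, 175, 768)    # stand-in for RoBERTa character/sentence features
logits = RBLCMSketch()(x)       # (2, 6): batch of 2, six emotion classes
```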

Key words: RoBERTa; Multi-Head Self-Attention Fusion; Bi-LSTM; Microblog Sentiment Analysis; Capsule Network
Received: 28 September 2022      Published: 13 September 2023
ZTFLH: TP391; G350
Fund:Fundamental Research Funds for the Central Universities, People’s Public Security University of China Project(2022JKF02018)
Corresponding Author: Chen Peng, E-mail: chenpeng@ppsuc.edu.cn

Cite this article:

Wu Xuxu, Chen Peng, Jiang Huan. Micro-Blog Fine-Grained Sentiment Analysis Based on Multi-Feature Fusion. Data Analysis and Knowledge Discovery, 2023, 7(12): 102-113.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.1028     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I12/102

[Figure: Structure of RB-LCM]
[Figure: Capsule Flow]
[Figure: Adversarial Learning]
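The adversarial-learning figure corresponds to FGM training (Miyato et al.'s Fast Gradient Method on word embeddings). A common implementation pattern is sketched below; the class and the `emb_name` matching rule are a generic sketch, not the authors' code.

```python
import torch

class FGM:
    """Fast Gradient Method: perturb the embedding weights by an
    epsilon-scaled step along the gradient direction, then restore them."""
    def __init__(self, model, eps=1.0):
        self.model, self.eps, self.backup = model, eps, {}

    def attack(self, emb_name="embedding"):
        for name, p in self.model.named_parameters():
            if p.requires_grad and emb_name in name and p.grad is not None:
                self.backup[name] = p.data.clone()   # save clean weights
                norm = torch.norm(p.grad)
                if norm != 0 and not torch.isnan(norm):
                    p.data.add_(self.eps * p.grad / norm)  # r_adv = eps * g/||g||

    def restore(self, emb_name="embedding"):
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}

# Typical training step:
#   loss.backward()          # 1) gradients on the clean input
#   fgm.attack()             # 2) perturb embeddings
#   loss_adv.backward()      # 3) adversarial pass accumulates gradients
#   fgm.restore()            # 4) restore embeddings
#   optimizer.step(); optimizer.zero_grad()
```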
Dataset            Training  Test    Validation  Emotion Classes
SMP2020-EWECT      38 699    9 675   -           6
NLPCC2013          11 200    2 800   -           8
NLPCC2014          14 000    6 000   -           8
weibo_senti_100k   95 990    11 999  11 999      2
Dataset Information
Module                     Parameter                        Value
RoBERTa                    learning rate                    2e-5
                           layer-wise lr decay coefficient  0.95
                           weight decay                     1e-5
                           word-vector dimension            768
                           max sentence length              175
Bi-LSTM                    learning rate                    1e-4
                           hidden dimension                 384
                           number of layers                 1
                           weight decay                     1e-5
Capsule                    learning rate                    1e-4
                           number of capsules               24
                           capsule dimension                32
                           weight decay                     1e-5
Multi-head self-attention  number of heads                  3
FC                         dropout rate                     0.4
Training                   optimizer                        Adam
                           batch size                       16
                           epochs                           10
Model Parameter Settings
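The parameter table pairs a base RoBERTa learning rate (2e-5) with a layer-wise decay coefficient (0.95) and higher rates (1e-4) for the other modules. One common way to realize this is with Adam parameter groups; the helper below is a sketch under the assumption that encoder parameters carry names like `...layer.3...`, which is how Hugging Face-style BERT/RoBERTa modules are named, not a detail given in the paper.

```python
import torch

def layerwise_param_groups(model, base_lr=2e-5, decay=0.95,
                           head_lr=1e-4, weight_decay=1e-5, n_layers=12):
    """Build Adam parameter groups: the top encoder layer keeps base_lr,
    each layer below it is scaled by `decay` once more; every parameter
    outside the encoder layers gets head_lr."""
    groups = []
    for name, p in model.named_parameters():
        if "layer." in name:                             # e.g. "encoder.layer.3.…"
            i = int(name.split("layer.")[1].split(".")[0])
            lr = base_lr * decay ** (n_layers - 1 - i)   # deeper layer -> larger lr
        else:
            lr = head_lr                                 # Bi-LSTM, capsule, FC, …
        groups.append({"params": [p], "lr": lr, "weight_decay": weight_decay})
    return groups

# optimizer = torch.optim.Adam(layerwise_param_groups(model))
```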
Environment               Configuration
Operating system          Windows 10
GPU                       Quadro P4000
RAM                       8 GB
Programming language      Python 3.8
Deep learning framework   PyTorch 1.12.1
Experiment Environment
[Figure: Influence of Routing Iteration Times]
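The routing-iteration experiment refers to the dynamic routing-by-agreement procedure of the capsule network (Sabour et al.). A minimal sketch of that procedure is below; the tensor layout `(batch, n_in, n_out, d_out)` is our convention for illustration.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1):
    # Capsule non-linearity: output norm is strictly below 1.
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1 + n2)) * s / (n2.sqrt() + 1e-8)

def dynamic_routing(u_hat, n_iter=3):
    """Routing-by-agreement over prediction vectors.
    u_hat: (batch, n_in, n_out, d_out) predictions from lower capsules.
    Returns the output capsules, shape (batch, n_out, d_out)."""
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(n_iter):
        c = F.softmax(b, dim=2).unsqueeze(-1)     # coupling coefficients
        v = squash((c * u_hat).sum(dim=1))        # weighted vote -> output capsule
        b = b + (u_hat * v.unsqueeze(1)).sum(-1)  # raise logits where votes agree
    return v
```

More iterations sharpen the coupling coefficients toward the capsules that agree, which is the quantity the figure varies.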
Model  ACC(%)  F1(%)
BERT-BiLSTM 71.52 63.23
BERT-BiGRU-Attention 76.15 67.79
BERT-HAN 78.77 72.63
RB-LCM 80.64 77.41
Performance of Different Models on SMP2020-EWECT Dataset
Model  ACC(%)  F1(%)
C-BiLSTM 54.68 42.82
CNN_BiLSTM 55.07 44.17
EMCNN 63.12 47.23
RB-LCM 67.17 51.08
Performance of Different Models on NLPCC2013 Dataset
Model  ACC(%)  F1(%)
TextRCNN 67.23 53.24
Transformer 66.08 53.29
BLLC-CL 70.57 56.59
RB-LCM 71.27 58.25
Performance of Different Models on NLPCC2014 Dataset
Model  ACC(%)  F1(%)
TextRCNN 95.73 95.75
CNN-LSTM 96.81 96.81
CBMA 97.65 97.51
RB-LCM 98.45 98.44
Performance of Different Models on weibo_senti_100k Dataset
Model        NLPCC2013        NLPCC2014        SMP2020-EWECT    weibo_senti_100k
             ACC(%)  F1(%)    ACC(%)  F1(%)    ACC(%)  F1(%)    ACC(%)  F1(%)
RB-LCM 67.17 51.08 71.27 58.25 80.64 77.41 98.45 98.44
RB-LCM-P 66.25 49.93 66.18 57.56 80.34 76.90 98.21 97.80
RB-LCM-Hn 66.53 50.39 68.78 56.22 79.86 77.10 97.98 97.10
RB-LCM-M 66.46 47.80 67.46 56.90 80.04 74.71 97.65 96.87
RB-LCM-F 65.71 48.68 69.60 56.62 79.53 75.21 97.42 96.88
RB-LCM-FGM 66.86 50.21 70.38 57.36 79.17 76.84 98.11 97.67
Fuse1 66.04 48.13 69.54 54.18 80.14 75.78 98.04 97.88
Fuse2 66.78 50.10 70.12 56.34 80.02 76.23 97.87 97.71
Fuse3 65.98 49.23 68.78 56.32 79.54 75.47 97.44 97.23
Fuse4 66.92 50.42 69.55 57.20 79.92 76.38 97.99 97.95
Ls1 66.14 49.55 69.95 57.42 79.86 77.23 98.15 98.10
Ls2 66.22 50.13 69.47 56.84 79.97 77.06 98.07 97.86
Performance of Ablation Experiments
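The ablation rows RB-LCM-F and RB-LCM-FGM remove the improved Focal Loss and the adversarial training, respectively. The paper's improved variant is not reproduced here; for reference, a standard multi-class focal loss (Lin et al.), which down-weights easy examples to counter label imbalance, can be sketched as:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """Multi-class focal loss: scales cross entropy by (1 - p_t)^gamma so
    confident (easy) examples contribute less; optional per-class weights
    alpha rebalance rare emotion labels."""
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = log_pt.exp()
    loss = -((1 - pt) ** gamma) * log_pt
    if alpha is not None:
        loss = loss * alpha[targets]            # class-frequency weighting
    return loss.mean()
```

With `gamma=0` and no `alpha` this reduces exactly to cross entropy, which makes the down-weighting effect easy to verify.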