Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (11): 46-55    DOI: 10.11925/infotech.2096-3467.2022.0751
Sentiment Analysis of Micro-blog on Public Health Emergency with Prompt Embedding
Lai Yubin 1, Chen Yan 1 (corresponding author), Hu Xiaochun 2, Huang Xin 3
1 School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
2 School of Big Data and Artificial Intelligence, Guangxi University of Finance and Economics, Nanning 530007, China
3 College of Information Engineering, Guangxi Vocational University of Agriculture, Nanning 530007, China
Abstract  

[Objective] In the early stage of a public health emergency, Weibo posts are scarce and informally worded, which makes sentiment analysis ineffective. To address this, we propose a sentiment analysis model for Weibo posts based on prompt embedding and emotion feature fusion. [Methods] First, we extracted sentiment information from Weibo posts using an emotion dictionary. Second, we used the pre-trained RoBERTa model to build semantic and sentiment vectors, and embedded prompts as prefixes of the semantic vectors. Third, we used a Transformer encoder and an attention mechanism to extract semantic and emotional features, and computed the sample feature weights with the focal loss function. Finally, we fused the semantic and emotional features to classify sentiment. [Results] We evaluated the model on Weibo comments about the COVID-19 outbreak in Shenzhen. Its accuracy and F1 score reached 93.46% and 93.49%, respectively, which are 6.78 and 6.97 percentage points higher than those of the baseline BERT model. [Limitations] Weibo data contains a large number of images and videos, but our model does not include multi-modal fusion for sentiment analysis. [Conclusions] The proposed model improves sentiment classification when only a small sample is available.
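The prefix step in [Methods] can be pictured with a small sketch. It is not the authors' implementation: the vector dimension, the number of prompt vectors, the random initialisation, and the function names `make_prompt_prefix` and `prepend_prompts` are all assumptions made for illustration, and plain Python lists stand in for RoBERTa output.

```python
import random


def make_prompt_prefix(num_prompts, dim, seed=0):
    """Create num_prompts prompt vectors (randomly initialised; assumed trainable)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_prompts)]


def prepend_prompts(prompt_vectors, token_vectors):
    """Return the prompt vectors followed by the token vectors."""
    if prompt_vectors and token_vectors:
        if len(prompt_vectors[0]) != len(token_vectors[0]):
            raise ValueError("prompt and token vectors must share one dimension")
    return prompt_vectors + token_vectors


# Toy stand-in for the semantic vectors of one post: 5 tokens, 8 dimensions.
token_vectors = [[0.0] * 8 for _ in range(5)]
prefix = make_prompt_prefix(num_prompts=3, dim=8)
sequence = prepend_prompts(prefix, token_vectors)
print(len(sequence), len(sequence[0]))  # 8 8
```

The encoder then sees a sequence that is three positions longer, with the prompt vectors in front of the token vectors.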

Key words: Prompt Embedding; Feature Fusion; Few Shot; Sentiment Analysis; Public Health Emergency
Received: 19 July 2022      Published: 22 March 2023
CLC Number (ZTFLH): G350
Fund: Guangxi Scientific Research and Technology Development Program (Grant No. 桂科AA20302002-3); Guangxi Natural Science Foundation (Grant No. 2020GXNSFAA159090)
Corresponding Author: Chen Yan, ORCID: 0000-00002-9950-684X, E-mail: cy@gxu.edu.cn

Cite this article:

Lai Yubin, Chen Yan, Hu Xiaochun, Huang Xin. Sentiment Analysis of Micro-blog on Public Health Emergency with Prompt Embedding. Data Analysis and Knowledge Discovery, 2023, 7(11): 46-55.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0751     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I11/46

MESA Model Structure
The Process for Extracting Emotional Information
Self-Attention Calculation Process
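Since the figure itself is not reproduced here, the following is a generic sketch of scaled dot-product self-attention, not the exact calculation shown in the paper. It assumes the queries, keys, and values are all the input vectors themselves (no learned projection matrices) and uses a single head; `self_attention` is a name chosen for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumption)."""
    dim = len(x[0])
    outputs, weights = [], []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of all value vectors.
        outputs.append([sum(row[j] * x[j][c] for j in range(len(x))) for c in range(dim)])
    return outputs, weights


x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the input vectors.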
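Raw data | Processed data | Label | English gloss of processed text
#深圳疫情#这核酸做的不是很地道啊,说好的一米一个距离呢 kk钢铁侠911的微博视频 | 这核酸做的不是很地道啊说好的一米一个距离呢 | -1 | This nucleic acid testing is not being done properly; what happened to the promised one-metre spacing?
#深圳疫情# 3月18日0-24时深圳新增77例病例,其中7例在社区筛查中发现,4例在重点人群筛查中发现,22例在重点区域筛查中发现,44例在隔离观察的密接人员排查中发现;46例诊断为新型冠状病毒感染确证病例,31例诊断为新冠病毒无症状感染者。详情请见下图↓↓↓ | 3月18日0-24时深圳新增77例病例其中7例在社区筛查中发现4例在重点人群筛查中发现22例在重点区域筛查中发现44例在隔离观察的密接人员排查中发现46例诊断为新型冠状病毒感染确证病例31例诊断为新冠病毒无症状感染者 | 0 | From 00:00 to 24:00 on March 18, Shenzhen reported 77 new cases: 7 found in community screening, 4 in screening of key groups, 22 in screening of key areas, and 44 among close contacts under quarantine observation; 46 were diagnosed as confirmed COVID-19 cases and 31 as asymptomatic infections.
#深圳疫情# 加油深圳 @深圳卫健委 深圳·深圳市宝安区体育中心 | 加油深圳 | 1 | Go Shenzhen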
Data Processing Sample
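The sample rows suggest that hashtags, @mentions, video attributions, location check-ins, and punctuation are stripped before classification. The sketch below is one way to do that with regular expressions; the patterns are inferred from the three sample rows only and are assumptions, not the authors' cleaning rules, and `clean_post` is a hypothetical helper.

```python
import re

HASHTAG = re.compile(r"#[^#]+#")            # topic tags such as #深圳疫情#
MENTION = re.compile(r"@\S+")               # @user mentions
VIDEO_TAIL = re.compile(r"\S+的微博视频")    # "<user>'s Weibo video" attribution
LOCATION = re.compile(r"\S+·\S+")           # check-in tags such as 城市·地点
# Keep CJK characters, ASCII letters, digits, and hyphens; drop everything else.
NON_TEXT = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9-]")


def clean_post(text):
    """Remove Weibo markup and punctuation from one post."""
    for pattern in (HASHTAG, MENTION, VIDEO_TAIL, LOCATION):
        text = pattern.sub(" ", text)
    return NON_TEXT.sub("", text)


print(clean_post("#深圳疫情#这核酸做的不是很地道啊,说好的一米一个距离呢 kk钢铁侠911的微博视频"))
# 这核酸做的不是很地道啊说好的一米一个距离呢
print(clean_post("#深圳疫情# 加油深圳 @深圳卫健委 深圳·深圳市宝安区体育中心"))
# 加油深圳
```

These rules reproduce the first and third sample rows. The second row also drops the trailing pointer to an image (详情请见下图), which would need an additional rule that this sketch does not attempt.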
Model | P/% | R/% | F1/% | Acc/%
BERT | 87.01 | 86.72 | 86.52 | 86.68
ERNIE | 87.67 | 87.50 | 87.26 | 87.72
MacBERT | 87.50 | 89.07 | 88.14 | 88.64
RoBERTa | 89.91 | 89.06 | 89.23 | 89.06
MESA | 93.60 | 93.45 | 93.49 | 93.46
Results of Different Classification Models
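The gains quoted in the abstract are differences in percentage points between the MESA and BERT rows of this table. A short check of that arithmetic, using only the accuracy and F1 values listed above (the helper name `point_gain` is chosen for this example):

```python
# Accuracy and F1 (in %) copied from the table of classification results.
results = {
    "BERT": {"F1": 86.52, "Acc": 86.68},
    "ERNIE": {"F1": 87.26, "Acc": 87.72},
    "MacBERT": {"F1": 88.14, "Acc": 88.64},
    "RoBERTa": {"F1": 89.23, "Acc": 89.06},
    "MESA": {"F1": 93.49, "Acc": 93.46},
}


def point_gain(model, baseline, metric):
    """Difference between two models in percentage points."""
    return round(results[model][metric] - results[baseline][metric], 2)


for metric in ("Acc", "F1"):
    print(metric, point_gain("MESA", "BERT", metric))
# Acc 6.78
# F1 6.97
```

The same helper shows that MESA's margin over the strongest baseline in the table, RoBERTa, is smaller than its margin over BERT.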
Model | P/% | R/% | F1/% | Acc/%
B-MESA | 88.80 | 88.66 | 88.61 | 88.61
E-MESA | 89.28 | 89.18 | 89.19 | 89.19
M-MESA | 89.21 | 90.39 | 88.36 | 90.30
MESA | 93.60 | 93.45 | 93.49 | 93.46
Comparison Results of the Word Vector Tool
Distribution Before and After Downsampling
Dataset | Model | P/% | R/% | F1/% | Acc/%
SED | RoBERTa | 89.91 | 89.06 | 89.23 | 89.06
SED | MESA | 93.60 | 93.45 | 93.49 | 93.46
SED0.8 | RoBERTa | 87.63 | 87.50 | 87.60 | 87.60
SED0.8 | MESA | 91.05 | 91.03 | 91.03 | 91.03
SED0.5 | RoBERTa | 84.38 | 84.59 | 84.16 | 84.38
SED0.5 | MESA | 88.92 | 89.03 | 88.89 | 89.03
SED0.2 | RoBERTa | 79.97 | 77.42 | 76.31 | 81.05
SED0.2 | MESA | 88.12 | 87.37 | 87.47 | 87.37
Comparison of Downsampling Results
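The small-sample claim can be read off this table by comparing the accuracy gap between the two models on each dataset. The snippet below only recomputes differences from the accuracy column; the variable names are chosen for this example.

```python
# Accuracy (in %) per dataset, copied from the downsampling table:
# (RoBERTa, MESA)
accuracy = {
    "SED": (89.06, 93.46),
    "SED0.8": (87.60, 91.03),
    "SED0.5": (84.38, 89.03),
    "SED0.2": (81.05, 87.37),
}

for dataset, (roberta, mesa) in accuracy.items():
    print(f"{dataset} {mesa - roberta:.2f}")
# SED 4.40
# SED0.8 3.43
# SED0.5 4.65
# SED0.2 6.32

# Accuracy lost by each model between the full and the smallest dataset.
roberta_drop = round(accuracy["SED"][0] - accuracy["SED0.2"][0], 2)
mesa_drop = round(accuracy["SED"][1] - accuracy["SED0.2"][1], 2)
print(roberta_drop, mesa_drop)  # 8.01 6.09
```

MESA's lead in accuracy is largest on SED0.2, and it loses fewer points than RoBERTa between SED and SED0.2.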
Model | Prompt embedding | Emotion feature branch | Loss function | P/% | R/% | F1/% | Acc/%
RoBERTa - - C 89.91 89.06 89.23 89.06
R-P - C 90.97 90.94 90.93 90.93
R-F - FL 91.40 91.38 91.37 91.38
M-C C 92.92 92.83 92.84 92.83
MESA FL 93.60 93.45 93.49 93.46
Comparison of Ablation Results
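The loss-function column of the ablation table contrasts C with FL, which appear to stand for cross-entropy and focal loss. As a reminder of how the two differ, here is a minimal sketch of the focal loss of Lin et al. for a single sample. The values gamma = 2 and alpha = 1 are common defaults assumed here, not settings reported on this page, and the probabilities are made up for illustration.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy for one sample given predicted class probabilities."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss: cross-entropy scaled by (1 - p_t) ** gamma."""
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Three classes (negative, neutral, positive); the true class is index 0.
easy = [0.9, 0.05, 0.05]   # confidently correct prediction
hard = [0.1, 0.6, 0.3]     # badly misclassified sample

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(name, round(fl / ce, 2))
# easy 0.01
# hard 0.81
```

With gamma = 2, the easy sample keeps about 1% of its cross-entropy loss while the hard sample keeps about 81%, so training concentrates on the samples the model still gets wrong.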