Data Analysis and Knowledge Discovery, 2024, Vol. 8, Issue (3): 77-84     https://doi.org/10.11925/infotech.2096-3467.2023.0004
Research Paper
Text Sentiment Classification Algorithm Based on Prompt Learning Enhancement*
Huang Taifeng, Ma Jing
College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract

[Objective] This paper addresses the low accuracy of sentiment classification achieved by pre-trained models when training samples are insufficient. [Methods] We propose Pe-RoBERTa, a sentiment classification model enhanced by prompt learning. Built on RoBERTa, it replaces traditional fine-tuning with an ensemble of prompts that helps the model better understand the downstream task and improves its extraction of sentiment features from text. [Results] Experiments on several public Chinese and English sentiment classification datasets show that, in few-shot settings, the model reaches an average classification accuracy of 93.2%, an improvement of 13.8 and 8.1 percentage points over traditional fine-tuning and discrete prompts, respectively. [Limitations] The model only handles text data, and the target task is binary sentiment classification; more fine-grained sentiment classification is not examined. [Conclusions] Pe-RoBERTa effectively extracts textual sentiment features and achieves high accuracy on multiple sentiment classification tasks.

Keywords: Pe-RoBERTa; Sentiment Classification; Prompt Learning; Feature Extraction
Abstract

[Objective] This paper addresses the low accuracy of sentiment classification obtained with pre-trained models when samples are insufficient. [Methods] We propose a prompt-learning-enhanced sentiment classification algorithm, Pe (prompt ensemble)-RoBERTa. Instead of the traditional fine-tuning approach, it equips the RoBERTa model with an ensemble of prompts that helps the model understand the downstream task and extract the text's sentiment features. [Results] We evaluated the model on several publicly available Chinese and English datasets. In few-shot settings its average sentiment classification accuracy reached 93.2%, an improvement of 13.8 and 8.1 percentage points over traditional fine-tuning and discrete prompts, respectively. [Limitations] The proposed model only processes text and targets binary sentiment classification; it does not cover more fine-grained sentiment classification tasks. [Conclusions] The Pe-RoBERTa model can effectively extract text sentiment features and achieves high accuracy in sentiment classification tasks.
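The prompt-ensemble idea described above can be illustrated with a short sketch. The code below is not the authors' implementation: it assumes a generic RoBERTa masked language model from HuggingFace Transformers (`roberta-base`), two hypothetical prompt templates, and the label words "great"/"terrible", and simply averages the label-word probabilities predicted at the mask position across templates.

```python
# Hypothetical sketch of prompt-ensemble sentiment classification (not the
# authors' code): each template wraps the input text, the masked LM scores the
# label words at the <mask> position, and per-template probabilities are averaged.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "roberta-base"  # assumption: any RoBERTa-style masked-LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)
model.eval()

# Assumed templates and verbalizer; the paper's actual prompts may differ.
TEMPLATES = [
    "{text} It was <mask>.",
    "{text} All in all, the experience was <mask>.",
]
LABEL_WORDS = {"positive": " great", "negative": " terrible"}

def classify(text: str) -> str:
    """Average the label-word probabilities at the mask over all templates."""
    scores = {label: 0.0 for label in LABEL_WORDS}
    for template in TEMPLATES:
        inputs = tokenizer(template.format(text=text), return_tensors="pt")
        mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
        with torch.no_grad():
            logits = model(**inputs).logits[0, mask_pos]
        probs = logits.softmax(dim=-1)
        for label, word in LABEL_WORDS.items():
            word_id = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))[0]
            scores[label] += probs[word_id].item() / len(TEMPLATES)
    return max(scores, key=scores.get)

print(classify("it's not just a feel-good movie, it's a feel movie."))
```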

Key words: Pe-RoBERTa; Sentiment Classification; Prompt Learning; Feature Extraction
Received: 2023-01-03      Published: 2024-01-08
CLC number: TP391
Funding: *National Natural Science Foundation of China (No. 72174086); Postgraduate Research and Practice Innovation Program of Nanjing University of Aeronautics and Astronautics (No. xcxih20220910)
Corresponding author: Ma Jing, ORCID: 0000-0001-8472-2518, E-mail: majing5525@126.com.
Cite this article:
Huang Taifeng, Ma Jing. Text Sentiment Classification Algorithm Based on Prompt Learning Enhancement[J]. Data Analysis and Knowledge Discovery, 2024, 8(3): 77-84.
Article link:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2023.0004   or   https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2024/V8/I3/77
Fig. 1  Structure of the Pe-RoBERTa model
Text | Sentiment label
it's not just a feel-good movie, it's a feel movie. | Positive
it lacks the compassion, humor and the level of insight that made first film... | Negative
住这边挺方便的,周围餐馆,商场什么的都有,装修也还不错。 | Positive
没有比这更差的酒店了,房间灯光暗淡,空调无法调节,前台服务僵化。 | Negative
Table 1  Examples from the datasets
Parameter | Value
Encoder layers | 12
Hidden units | 768
Attention heads | 12
Vocabulary size | 21,128
Hidden-layer activation | ReLU
Table 2  RoBERTa network parameters
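For readers who want to instantiate an encoder with the Table 2 settings, the following sketch (an illustration, not the authors' code) builds a RoBERTa-style masked language model with HuggingFace Transformers. The 21,128-token vocabulary is typical of Chinese BERT/RoBERTa checkpoints; the exact checkpoint the authors used is not stated on this page.

```python
# Illustration only: a RoBERTa-style encoder configured with the Table 2 values.
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(
    vocab_size=21128,        # vocabulary size (Table 2)
    num_hidden_layers=12,    # encoder layers
    hidden_size=768,         # hidden-layer units
    num_attention_heads=12,  # attention heads
    hidden_act="relu",       # activation listed in Table 2
)
model = RobertaForMaskedLM(config)
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")
```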
Parameter | Value
Batch size | 8
Learning rate | 2e-5
Optimizer | AdamW
Warm-up ratio | 0.01
Weight decay | 0.01
Table 3  Model training parameters
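As a hedged sketch of how the Table 3 settings might be wired together, the snippet below uses PyTorch's AdamW and a linear warm-up schedule from HuggingFace Transformers. It assumes that the warm-up value of 0.01 is a proportion of the total training steps and uses a placeholder module in place of the actual classifier.

```python
# Sketch of the Table 3 training setup; the model here is only a stand-in.
import torch.nn as nn
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

model = nn.Linear(768, 2)        # placeholder for the Pe-RoBERTa classifier head
BATCH_SIZE = 8                   # Table 3: batch size
TOTAL_STEPS = 1000               # assumed; depends on dataset size and epochs

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.01 * TOTAL_STEPS),  # warm-up proportion 0.01 (assumed interpretation)
    num_training_steps=TOTAL_STEPS,
)
```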
Sample size | Model | IMDB (Acc/%) | SST-2 (Acc/%) | ChnSentiCorp (Acc/%) | Average (%)
K=32 | RoBERTa fine-tuning | 78.4 | 80.6 | 79.3 | 79.4
K=32 | Discrete prompts | 83.8 | 85.1 | 86.4 | 85.1
K=32 | LM-BFF | 92.3 | 92.6 | 91.5 | 92.1
K=32 | Pe-RoBERTa | 93.3 | 93.9 | 92.6 | 93.2
K=256 | RoBERTa fine-tuning | 82.4 | 84.8 | 83.2 | 83.5
K=256 | Discrete prompts | 83.8 | 85.1 | 86.4 | 85.1
K=256 | LM-BFF | 92.6 | 93.0 | 92.1 | 92.6
K=256 | Pe-RoBERTa | 93.8 | 94.1 | 92.7 | 93.5
K=all | RoBERTa fine-tuning | 95.7 | 96.7 | 95.2 | 95.9
K=all | Discrete prompts | 83.8 | 85.1 | 86.4 | 85.1
K=all | LM-BFF | 93.9 | 94.8 | 94.0 | 94.2
K=all | Pe-RoBERTa | 94.6 | 95.5 | 94.7 | 94.9
Table 4  Model performance comparison
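The K values in Table 4 denote the amount of labelled training data. Below is a minimal sketch of one common way to build such few-shot subsets, sampling K examples per class as in LM-BFF; whether the paper counts K per class or in total is an assumption here, and the helper name is hypothetical.

```python
# Hypothetical helper: sample a K-shot training subset with K examples per class.
import random

def k_shot_subset(examples, k, seed=42):
    """examples: list of (text, label) pairs; returns k examples per label."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    subset = []
    for items in by_label.values():
        subset.extend(rng.sample(items, min(k, len(items))))
    rng.shuffle(subset)
    return subset

data = [("great movie", "positive"), ("boring plot", "negative")] * 100
print(len(k_shot_subset(data, k=32)))  # 64 examples: 32 positive + 32 negative
```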
Model | IMDB (Acc/%) | SST-2 (Acc/%) | ChnSentiCorp (Acc/%) | Average (%)
RoBERTa fine-tuning | 78.4 | 80.6 | 79.3 | 79.4
+ Discrete prompts | 83.8 | 85.1 | 85.4 | 85.1
+ Continuous prompts | 89.4 | 91.2 | 90.7 | 90.4
Table 5  Ablation study results