Data Analysis and Knowledge Discovery, 2023, Vol. 7, Issue (11): 37-45     https://doi.org/10.11925/infotech.2096-3467.2022.0949
Research Article
Analyzing Text Sentiments Based on Patch Attention and Involution
Lin Zhe(),Chen Pinghua
School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
Full Text: PDF (795 KB) | HTML

Abstract

[Objective] When the width of a convolution kernel equals the dimension of the word vector, the convolution layer has too many parameters; moreover, the sparse connectivity of convolution, together with its spatial invariance and channel specificity, is ill-suited to text tasks. This paper addresses these issues. [Methods] We propose a text sentiment analysis model based on a patch attention mechanism and Involution. The model first deforms each word vector after word segmentation, reshaping the one-dimensional vector into an n×n word matrix block, and then splices the blocks of the words in a sentence into a sentence matrix. The patch attention layer then strengthens the contextual relevance and position-order information of the text features in the sentence matrix. Next, Involution, which has spatial specificity and channel invariance, extracts features from the sentence matrix. Finally, a fully connected layer performs sentiment classification. [Results] On three public sentiment analysis datasets, waimai_10k, IMDB, and Tweet, the model's classification accuracy reached 88.47%, 86.22%, and 94.42%, respectively: 6.47, 7.72, and 9.35 percentage points higher than a word-vector convolutional network, and 1.07, 1.01, and 0.59 percentage points higher than the Bi-LSTM recurrent model. [Limitations] The model's classification accuracy on large datasets is lower than on small and medium-sized ones. [Conclusions] The proposed model resolves the problems of excessive parameters, the sparse connectivity of the convolution operation, and the spatial invariance and channel specificity of convolution, and outperforms traditional convolutional models across the tested datasets.
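The deformation and splicing steps in [Methods] can be sketched in plain Python. The dimensions below (d = 4, n = 2, two words) are toy values chosen for illustration; the paper's actual embedding size and block size are not assumed here.

```python
# Sketch of the word-vector deformation step: each d-dimensional word
# vector (d = n*n) is reshaped into an n x n block, and the blocks of
# a sentence are tiled side by side into a sentence matrix.

def vector_to_block(vec, n):
    """Reshape a length-n*n word vector into an n x n matrix block."""
    assert len(vec) == n * n
    return [vec[i * n:(i + 1) * n] for i in range(n)]

def sentence_matrix(word_vectors, n):
    """Concatenate the n x n blocks of a sentence horizontally,
    giving an n x (n * sentence_length) matrix."""
    blocks = [vector_to_block(v, n) for v in word_vectors]
    return [sum((b[row] for b in blocks), []) for row in range(n)]

# Toy example: two "words" with 4-dimensional embeddings, n = 2.
words = [[1, 2, 3, 4], [5, 6, 7, 8]]
m = sentence_matrix(words, 2)
# m == [[1, 2, 5, 6], [3, 4, 7, 8]]
```

The point of the reshape is that a subsequent 3×3 kernel slides over small square blocks instead of spanning the full embedding dimension.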

Key words: Text Sentiment Analysis; Word Vector Deformation; Patch Attention; Involution
Received: 2022-09-09      Published: 2023-03-28
CLC Number: TP393 G350
Funding: Key-Area Research and Development Program of Guangdong Province (2020B0101100001, 2021B0101200002)
Corresponding author: Lin Zhe, ORCID: 0000-0003-0894-5159, E-mail: 1417505493@qq.com
Cite this article:
Lin Zhe, Chen Pinghua. Analyzing Text Sentiments Based on Patch Attention and Involution. Data Analysis and Knowledge Discovery, 2023, 7(11): 37-45.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0949      或      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I11/37
Fig.1  Framework of the PATT-INN model
Fig.2  The patch attention module
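The patch attention module of Fig.2 is not specified in detail on this page, so the sketch below is plain scaled dot-product attention over flattened patch vectors, without the learned projections a real module would have; it only illustrates how patches can be re-weighted by their pairwise similarity to inject context into each block.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def patch_attention(patches):
    """patches: list of flattened patch vectors.
    Returns each patch replaced by a similarity-weighted mix of all patches."""
    out = []
    for q in patches:
        # Scaled dot-product scores of this patch against every patch.
        scores = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
                          for k in patches])
        # Weighted sum of the patch vectors.
        out.append([sum(w * v[d] for w, v in zip(scores, patches))
                    for d in range(len(q))])
    return out
```

With identical patches the weights are uniform and each patch is returned unchanged, which is a quick sanity check of the mechanics.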
| Dataset    | Language | Positive | Negative | Neutral |
|------------|----------|----------|----------|---------|
| waimai_10k | Chinese  | 4,000    | 8,000    | 0       |
| IMDB       | English  | 12,500   | 12,500   | 0       |
| Tweet      |          | 2,363    | 9,178    | 3,099   |

Table 1  Datasets
| Parameter                    | Value |
|------------------------------|-------|
| Number of Involution kernels | 32    |
| Involution kernel size       | 3×3   |
| Activation function          | tanh  |
| Dropout rate                 | 0.5   |
| Batch size                   | 100   |
| Epochs                       | 15    |

Table 2  Experimental parameter settings
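Involution inverts convolution's design: the kernel is generated from the feature at each spatial position (spatial specificity) and shared across the channels of a group (channel invariance). Below is a minimal single-channel sketch of that idea; the kernel-generating function is a stand-in for the learned mapping of the real operator, and Table 2's exact configuration is not reproduced.

```python
def gen_kernel(center, k=3):
    """Hypothetical kernel generator: a learned network in the real operator.
    Here the k x k kernel is simply derived from the center value."""
    w = center / (k * k)
    return [[w] * k for _ in range(k)]

def involution2d(x, k=3):
    """Apply a position-specific k x k kernel at every interior pixel
    of a 2-D single-channel feature map x (list of lists)."""
    h, w = len(x), len(x[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(r, h - r):
        for j in range(r, w - r):
            kern = gen_kernel(x[i][j], k)  # kernel depends on this position
            out[i][j] = sum(
                kern[a + r][b + r] * x[i + a][j + b]
                for a in range(-r, r + 1) for b in range(-r, r + 1)
            )
    return out
```

Because the kernel is produced on the fly rather than stored as weights, the layer's stored parameters live only in the small generating network, which is what keeps the count in Table 4 low.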
| Model    | waimai_10k (P/R/F1, %) | IMDB (P/R/F1, %)      | Tweet (P/R/F1, %)     |
|----------|------------------------|-----------------------|-----------------------|
| TextCNN  | 82.00 / 82.00 / 82.00  | 78.50 / 78.50 / 78.50 | 85.07 / 85.07 / 85.07 |
| RCNN     | 87.07 / 87.37 / 87.22  | 85.02 / 84.55 / 84.68 | 92.51 / 92.64 / 92.57 |
| GRU      | 87.40 / 87.40 / 87.40  | 85.30 / 85.30 / 85.30 | 92.78 / 92.78 / 92.78 |
| LSTM     | 87.25 / 87.25 / 87.25  | 84.71 / 84.71 / 84.71 | 92.07 / 92.07 / 92.07 |
| Bi-LSTM  | 87.40 / 87.40 / 87.40  | 85.21 / 85.21 / 85.21 | 93.83 / 93.83 / 93.83 |
| ATT-LSTM | 87.42 / 87.42 / 87.42  | 84.98 / 84.98 / 84.98 | 92.90 / 92.90 / 92.90 |
| PATT-CNN | 87.50 / 87.50 / 87.50  | 85.30 / 85.30 / 85.30 | 94.00 / 94.00 / 94.00 |
| PATT-INN | 88.47 / 88.47 / 88.47  | 86.22 / 86.22 / 86.22 | 94.42 / 94.42 / 94.42 |

Table 3  Results of the multi-model comparison experiment
| Model    | Conv/Involution layer parameters |
|----------|----------------------------------|
| PATT-INN | 320                              |
| TextCNN  | 24,768                           |
| ATT-CNN  | 24,768                           |
| RCNN     | 18,496                           |

Table 4  Results of Experiment 2
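The gap in Table 4 follows from how convolution parameters are counted: a standard layer stores k_h × k_w × C_in × C_out weights, so a text CNN whose kernel width equals the embedding dimension pays for that full width, while a small square kernel over the deformed n×n blocks does not. The configurations below are illustrative assumptions to show the order-of-magnitude difference, not the paper's exact layer settings.

```python
def conv_params(kh, kw, c_in, c_out):
    """Weight count of a standard convolution layer (bias omitted)."""
    return kh * kw * c_in * c_out

# Hypothetical text-CNN setup: kernel width equal to a 128-dimensional
# embedding, 64 filters -- the situation the paper is avoiding.
wide = conv_params(3, 128, 1, 64)   # 24576

# After deforming vectors into n x n blocks, a 3 x 3 kernel suffices.
small = conv_params(3, 3, 1, 32)    # 288
```

Involution shrinks this further still, since its kernels are generated per position rather than stored as layer weights.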
| Model    | P/%   | R/%   | F1/%  |
|----------|-------|-------|-------|
| PATT-CNN | 85.30 | 85.30 | 85.30 |
| PATT-INN | 86.22 | 86.22 | 86.22 |

Table 5  Results of Experiment 3
| Model    | P/%   | R/%   | F1/%  |
|----------|-------|-------|-------|
| INN      | 84.58 | 84.58 | 84.58 |
| PATT-INN | 86.22 | 86.22 | 86.22 |

Table 6  Results of Experiment 4
| Ablation setting          | P/%   | R/%   | F1/%  |
|---------------------------|-------|-------|-------|
| Without deformation layer | 85.58 | 85.58 | 85.58 |
| With deformation layer    | 86.22 | 86.22 | 86.22 |

Table 7  Results of Experiment 5