1. School of Management, Hefei University of Technology, Hefei 230009, China
2. Key Laboratory of Process Optimization & Intelligent Decision-Making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
[Objective] This paper proposes a sarcasm detection model, ADGCN-MFM, that combines an affective dependency graph convolutional neural network with modality fusion. It aims to improve multimodal sarcasm detection by exploiting the sentiment information and syntactic dependencies of the text. [Methods] The model enriches the sentiment and syntactic information of the text modality using sentiment graphs and syntactic dependency graphs. It applies graph convolutional neural networks to obtain text representations with rich sentiment semantics and then combines the features of the different modalities through modal fusion. Finally, it uses a self-attention mechanism to filter out redundant information and performs sarcasm detection on the fused representation. [Results] The model reached an accuracy of 85.85%, which is 3.46%, 2.25%, 1.83%, and 0.95% higher than the baseline models HFM, Res-BERT, D&R Net, and IIMI-MMSD, respectively. Its F1 value reached 84.80%, 1.44% higher than the baseline models. [Limitations] The generalization and robustness of the model still need to be validated on more datasets. [Conclusions] The proposed model makes full use of the sentiment and syntactic dependencies of the text and detects multimodal sarcasm effectively.
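The graph step summarized under [Methods] can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the sentiment graph weights a pair of words by the absolute difference of their sentiment scores, that the two graphs are merged by element-wise addition, and that a single row-normalized graph convolution with ReLU is applied. The function names, the toy sentence, and the sentiment scores are all hypothetical.

```python
def affective_dependency_adjacency(dep_edges, sentiment, n):
    """Merge a dependency graph and a sentiment graph (assumed: element-wise sum)."""
    adj = [[0.0] * n for _ in range(n)]
    # Syntactic dependency graph: undirected edge between head and dependent.
    for i, j in dep_edges:
        adj[i][j] = adj[j][i] = 1.0
    # Sentiment graph (assumption): weight = |score_i - score_j|, so words
    # with contrasting sentiment are connected more strongly.
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i][j] += abs(sentiment[i] - sentiment[j])
    # Self-loops keep each word's own features in the aggregation.
    for i in range(n):
        adj[i][i] += 1.0
    return adj


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1 * A * H * W), written with plain lists."""
    n, d_in, d_out = len(adj), len(weight), len(weight[0])
    out = []
    for i in range(n):
        degree = sum(adj[i])  # positive because of the self-loop
        agg = [sum(adj[i][k] * feats[k][d] for k in range(n)) / degree
               for d in range(d_in)]
        out.append([max(0.0, sum(agg[d] * weight[d][o] for d in range(d_in)))
                    for o in range(d_out)])
    return out


# Hypothetical three-word example: "love", "this", "delay".
dep_edges = [(0, 1), (1, 2)]   # assumed dependency arcs
sentiment = [0.8, 0.0, -0.6]   # assumed lexicon scores
adj = affective_dependency_adjacency(dep_edges, sentiment, 3)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adj, identity, identity)
```

With identity features and weights, each row of `hidden` is simply the normalized neighborhood of a word, which makes it easy to see how strongly "love" and "delay" attend to each other despite having no direct dependency arc.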
Yu Bengong, Ji Xiaohan. Detecting Multimodal Sarcasm Based on ADGCN-MFM[J]. Data Analysis and Knowledge Discovery, 2023, 7(10): 85-94.