Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (10): 85-94    DOI: 10.11925/infotech.2096-3467.2022.0987
Detecting Multimodal Sarcasm Based on ADGCN-MFM
Yu Bengong1,2, Ji Xiaohan1
1School of Management, Hefei University of Technology, Hefei 230009, China
2Key Laboratory of Process Optimization & Intelligent Decision-Making, Ministry of Education, Hefei University of Technology, Hefei 230009, China
Abstract  

[Objective] This paper proposes a sarcasm detection model based on an affective dependency graph convolutional network with modality fusion (ADGCN-MFM). It aims to improve multimodal sarcasm detection by exploiting the sentiment information and syntactic dependencies of texts. [Methods] The model enriches the text modality with sentiment and syntactic information by building a sentiment graph and a syntactic dependency graph, applies graph convolutional networks to obtain text representations with rich sentiment semantics, and then fuses multimodal features through modality fusion. Finally, a self-attention mechanism filters redundant information, and sarcasm detection is performed on the fused representation. [Results] The model's accuracy reached 85.85%, which is 3.46, 2.25, 1.83, and 0.95 percentage points higher than the baseline models HFM, Res-BERT, D&R Net, and IIMI-MMSD, respectively. Its F1 score reached 84.80%, 1.44 percentage points above the best baseline. [Limitations] Further experiments on more datasets are needed to validate the model's generalization and robustness. [Conclusions] The proposed model thoroughly exploits the sentiment and syntactic dependencies of the text and effectively detects multimodal sarcasm.
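The pipeline described in the abstract builds a sentiment graph and a syntactic dependency graph over the text and propagates token features through graph convolution. Below is a minimal sketch of one GCN propagation step in the standard symmetrically normalized form (NumPy; the 4-token adjacency matrix and feature sizes are hypothetical illustrations, not the authors' implementation):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)            # ReLU activation

# Hypothetical 4-token sentence; edges mimic syntactic dependencies
# (token 1 is the head of tokens 0, 2, and 3).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.random.RandomState(0).randn(4, 8)  # token features (e.g., encoder outputs)
W = np.random.RandomState(1).randn(8, 8)  # learnable weight matrix
H = gcn_layer(A, X, W)                    # updated node representations, shape (4, 8)
```

Stacking such layers lets each token aggregate sentiment and syntactic context from progressively larger graph neighborhoods.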

Key words: Multimodality; Sarcasm Detection; Sentiment-Dependency; Graph Convolutional Neural Network; Modality Fusion
Received: 20 September 2022      Published: 21 March 2023
ZTFLH: TP393; G250
Fund:National Natural Science Foundation of China(72071061)
Corresponding Author: Yu Bengong, ORCID: 0000-0003-4170-2335, E-mail: bgyu19@163.com

Cite this article:

Yu Bengong, Ji Xiaohan. Detecting Multimodal Sarcasm Based on ADGCN-MFM. Data Analysis and Knowledge Discovery, 2023, 7(10): 85-94.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2022.0987     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2023/V7/I10/85

Structure of ADGCN-MFM Model
Architecture of ViT Model
Extraction and Processing of Image Attributes
Category    Training Set    Validation Set    Test Set
Positive    8,642           959               959
Negative    11,174          1,451             1,450
Total       19,816          2,410             2,409
Statistics of Datasets
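A quick consistency check of the split sizes above in plain Python (numbers copied directly from the table):

```python
# Positive/negative counts per split, from the dataset statistics table.
splits = {"train": (8642, 11174), "valid": (959, 1451), "test": (959, 1450)}

totals = {name: pos + neg for name, (pos, neg) in splits.items()}
print(totals)  # {'train': 19816, 'valid': 2410, 'test': 2409}

# Overall class balance: the dataset skews toward non-sarcastic examples.
pos_share = sum(p for p, _ in splits.values()) / sum(sum(s) for s in splits.values())
print(f"positive share: {pos_share:.2%}")
```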
Parameter                    Value
Word embedding dimension     768
Image embedding dimension    768
Bi-LSTM hidden size          128
Max sentence length          128
Dropout                      0.2
Batch size                   64
Learning rate                2e-5
Epochs                       10
Optimizer                    Adam
Loss function                Cross-entropy loss
Experimental Parameter Setting
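The table above maps directly onto a training configuration. A plain-Python sketch (the key names are illustrative, not taken from the authors' code):

```python
# Hyperparameters from the experimental parameter table.
config = {
    "word_dim": 768,        # BERT word-embedding dimension
    "image_dim": 768,       # ViT image-embedding dimension
    "lstm_hidden": 128,     # Bi-LSTM hidden size (per direction)
    "max_len": 128,         # maximum sentence length in tokens
    "dropout": 0.2,
    "batch_size": 64,
    "lr": 2e-5,
    "epochs": 10,
    "optimizer": "Adam",
    "loss": "CrossEntropyLoss",
}

# A Bi-LSTM concatenates forward and backward states, so token
# representations coming out of it are 2 * 128 = 256-dimensional.
bilstm_out_dim = 2 * config["lstm_hidden"]
```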
Modality       Model        F1/%    Precision/%    Recall/%    Accuracy/%
Image ResNet152* 65.13 54.41 70.80 64.76
ViT 65.59 60.32 71.87 66.19
Text TextCNN* 75.32 74.29 76.39 80.03
SMSD* 75.82 76.46 75.18 80.90
Bi-LSTM* 77.53 76.66 78.42 81.90
MIARN* 77.36 79.67 75.18 82.48
BERT* 80.22 78.27 82.27 83.85
Image+Text HFM 79.43 76.74 82.32 82.39
Res-BERT 82.93 82.73 83.19 83.60
D&R Net* 80.60 77.97 83.42 84.02
IIMI-MMSD 83.36 83.11 83.73 84.90
InCrossMGs* 82.84 81.38 84.36 86.10
ADGCN-MFM 84.80 84.33 85.27 85.85
Comparison Results
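All four columns in the table derive from confusion-matrix counts on the test set; the table reports each value multiplied by 100. A minimal sketch of the computation (the counts below are invented for illustration, not taken from the paper):

```python
def metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Hypothetical counts for a two-class sarcasm test set.
p, r, f1, acc = metrics(tp=80, fp=15, fn=14, tn=130)
```

F1 being the harmonic mean of precision and recall explains why a model can lead on accuracy (as InCrossMGs does at 86.10%) while trailing on F1.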
Model            F1/%    Precision/%    Recall/%    Accuracy/%
ADGCN-MFM        84.80   84.33          85.27       85.85
w/o Attribute    83.63   83.41          83.93       84.18
w/o A-Graph      82.83   82.71          82.96       83.47
w/o D-Graph      83.11   83.10          83.13       83.81
w/o Fusion       82.89   83.08          82.72       83.68
w/o Attention    83.00   82.73          83.43       83.52
Results of Ablation Experiments
The Effect of GCN Layers on Model Performance
No.    Image Attributes                                 Text                                                                                                         Label
1      'frown', 'woman', 'eyes', 'white', 'hand'        I got a nice cold for the rest of winter.                                                                    sarcasm
2      'man', 'wearing', 'sitting', 'hat', 'watch'      beautiful day, not a care in the world. oh i was talking about the picture not my cold freezing world.       sarcasm
3      'child', 'cake', 'smile', 'candles', 'woman'     What a happy day!                                                                                            not sarcasm
Examples of Cases
[1] 罗观柱, 赵妍妍, 秦兵, 等. 面向社交媒体的反讽识别[J]. 智能计算机与应用, 2020, 10(2): 301-307.
[1] (Luo Guanzhu, Zhao Yanyan, Qin Bing, et al. Social Media-Oriented Sarcasm Detection[J]. Intelligent Computer and Applications, 2020, 10(2): 301-307.)
[2] Potamias R A, Siolas G, Stafylopatis A G. A Transformer-Based Approach to Irony and Sarcasm Detection[J]. Neural Computing and Applications, 2020, 32(23): 17309-17320.
doi: 10.1007/s00521-020-05102-3
[3] Cai Y T, Cai H Y, Wan X J. Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 2506-2515.
[4] Sangwan S, Akhtar M S, Behera P, et al. I Didn’t Mean What I Wrote! Exploring Multimodality for Sarcasm Detection[C]// Proceedings of 2020 International Joint Conference on Neural Networks. 2020: 1-8.
[5] Wang X Y, Sun X W, Yang T, et al. Building a Bridge: A Method for Image-Text Sarcasm Detection Without Pretraining on Image-Text Data[C]// Proceedings of the 1st International Workshop on Natural Language Processing Beyond Text. 2020: 19-29.
[6] 钟佳娃, 刘巍, 王思丽, 等. 文本情感分析方法及应用综述[J]. 数据分析与知识发现, 2021, 5(6): 1-13.
[6] (Zhong Jiawa, Liu Wei, Wang Sili, et al. Review of Methods and Applications of Text Sentiment Analysis[J]. Data Analysis and Knowledge Discovery, 2021, 5(6): 1-13.)
[7] Abdu S A, Yousef A H, Salem A. Multimodal Video Sentiment Analysis Using Deep Learning Approaches, a Survey[J]. Information Fusion, 2021, 76(C): 204-226.
[8] Du Y P, Liu Y, Peng Z, et al. Gated Attention Fusion Network for Multimodal Sentiment Classification[J]. Knowledge-Based Systems, 2022, 240: 108107.
doi: 10.1016/j.knosys.2021.108107
[9] 袁景凌, 丁远远, 盛德明, 等. 基于视觉方面注意力的图像文本情感分析模型[J]. 计算机科学, 2022, 49(1): 219-224.
doi: 10.11896/jsjkx.201000074
[9] (Yuan Jingling, Ding Yuanyuan, Sheng Deming, et al. Image-Text Sentiment Analysis Model Based on Visual Aspect Attention[J]. Computer Science, 2022, 49(1): 219-224.)
[10] Wang K, Shen W Z, Yang Y Y, et al. Relational Graph Attention Network for Aspect-Based Sentiment Analysis[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 3229-3238.
[11] Xue X J, Zhang C X, Niu Z D, et al. Multi-Level Attention Map Network for Multimodal Sentiment Analysis[J]. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(5): 5105-5118.
[12] Yang X C, Feng S, Zhang Y F, et al. Multimodal Sentiment Detection Based on Multi-channel Graph Neural Networks[C]// Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021: 328-339.
[13] Pan H L, Lin Z, Fu P, et al. Modeling Intra and Inter-modality Incongruity for Multi-Modal Sarcasm Detection[C]// Findings of the Association for Computational Linguistics: EMNLP 2020. 2020: 1383-1392.
[14] Xu N, Zeng Z X, Mao W J. Reasoning with Multimodal Sarcastic Tweets via Modeling Cross-Modality Contrast and Semantic Association[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 3777-3786.
[15] Gupta S, Shah A, Shah M, et al. FiLMing Multimodal Sarcasm Detection with Attention[OL]. arXiv Preprint, arXiv: 2110.00416.
[16] 张继东, 蒋丽萍. 基于多模态深度学习的旅游评论反讽识别研究[J]. 情报理论与实践, 2022, 45(7): 158-164.
doi: 10.16353/j.cnki.1000-7490.2022.07.022
[16] (Zhang Jidong, Jiang Liping. Research on Irony Recognition of Travel Reviews Based on Multi-modal Deep Learning[J]. Information Studies: Theory & Application, 2022, 45(7): 158-164.)
[17] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale[OL]. arXiv Preprint, arXiv: 2010.11929.
[18] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.
[19] Voita E, Talbot D, Moiseev F, et al. Analyzing Multi-head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 5797-5808.
[20] Gaudart J, Giusiano B, Huiart L. Comparison of the Performance of Multi-layer Perceptron and Linear Regression for Epidemiological Data[J]. Computational Statistics & Data Analysis, 2004, 44(4): 547-570.
doi: 10.1016/S0167-9473(02)00257-8
[21] Ba J L, Kiros J R, Hinton G E. Layer Normalization[OL]. arXiv Preprint, arXiv: 1607.06450.
[22] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 2019: 4171-4186.
[23] Lou C W, Liang B, Gui L, et al. Affective Dependency Graph for Sarcasm Detection[C]// Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2021: 1844-1849.
[24] Cambria E, Li Y, Xing F Z, et al. SenticNet 6: Ensemble Application of Symbolic and Subsymbolic AI for Sentiment Analysis[C]// Proceedings of the 29th ACM International Conference on Information and Knowledge Management. 2020: 105-114.
[25] 罗曜儒, 李智. 基于Bi-LSTM的生物医学文本语义消歧研究[J]. 软件导刊, 2019, 18(4): 57-59.
[25] (Luo Yaoru, Li Zhi. Word Sense Disambiguation in Biomedical Text Based on Bi-LSTM[J]. Software Guide, 2019, 18(4): 57-59.)
[26] He K M, Zhang X Y, Ren S Q, et al. Deep Residual Learning for Image Recognition[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
[27] Kim Y. Convolutional Neural Networks for Sentence Classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1746-1751.
[28] Xiong T, Zhang P R, Zhu H B, et al. Sarcasm Detection with Self-matching Networks and Low-rank Bilinear Pooling[C]// Proceedings of the World Wide Web Conference. 2019: 2115-2124.
[29] Tay Y, Luu A T, Hui S C, et al. Reasoning with Sarcasm by Reading In-between[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018: 1010-1020.
[30] Liang B, Lou C W, Li X, et al. Multi-Modal Sarcasm Detection with Interactive In-Modal and Cross-Modal Graphs[C]// Proceedings of the 29th ACM International Conference on Multimedia. 2021: 4707-4715.