Detecting Multimodal Sarcasm Based on SC-Attention Mechanism
Chen Yuanyuan, Ma Jing
College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

Abstract [Objective] This paper designs an SC-Attention fusion mechanism to address the low prediction accuracy and the difficulty of fusing multimodal features in existing multimodal sarcasm detection models. [Methods] First, we used the CLIP and RoBERTa models to extract features from images, image attributes, and text. Second, we combined SENet's attention mechanism with the Co-Attention mechanism to build the SC-Attention mechanism and fuse the multimodal features. Third, we re-allocated the attention feature weights under the guidance of the original modalities. Finally, we fed the features into fully connected layers to detect sarcasm. [Results] The accuracy and F1 score of the proposed model reached 93.71% and 91.68%, which were 10.27 and 11.5 percentage points higher than those of the existing models. [Limitations] The model still needs to be evaluated on more datasets. [Conclusions] The proposed model reduces information redundancy and feature loss, which effectively improves the accuracy of multimodal sarcasm detection.
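
The fusion steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`se_gate`, `co_attention`, `fuse`), the dimensions, the random weights, and the residual addition used to stand in for re-weighting by the original modalities are all assumptions made for illustration, and random arrays replace the real CLIP and RoBERTa features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def se_gate(feats, w1, w2):
    # Squeeze-and-excitation style gate (hypothetical helper).
    # Squeeze: average over the n positions -> (d,).
    # Excite: bottleneck with ReLU, then sigmoid -> per-channel weights.
    squeezed = feats.mean(axis=0)
    hidden = np.maximum(0.0, squeezed @ w1)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))
    return feats * gate

def co_attention(a, b):
    # Each modality attends over the other through a shared affinity matrix.
    affinity = a @ b.T / np.sqrt(a.shape[1])   # (n, m)
    a_ctx = softmax(affinity, axis=1) @ b      # (n, d): a attends to b
    b_ctx = softmax(affinity.T, axis=1) @ a    # (m, d): b attends to a
    return a_ctx, b_ctx

def fuse(text, image, w1, w2):
    text_ctx, image_ctx = co_attention(se_gate(text, w1, w2),
                                       se_gate(image, w1, w2))
    # Assumption: a residual sum with the original features stands in for
    # the re-allocation of weights guided by the original modalities.
    text_out = (text + text_ctx).mean(axis=0)
    image_out = (image + image_ctx).mean(axis=0)
    return np.concatenate([text_out, image_out])

rng = np.random.default_rng(0)
d, r = 8, 2                        # feature size and bottleneck size (assumed)
text = rng.normal(size=(5, d))     # placeholder for RoBERTa token features
image = rng.normal(size=(3, d))    # placeholder for CLIP image features
w1 = rng.normal(size=(d, r))
w2 = rng.normal(size=(r, d))
fused = fuse(text, image, w1, w2)
print(fused.shape)  # (16,)
```

In a full model the fused vector would be passed to fully connected layers for the binary sarcasm decision; the sketch stops at the fused representation.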
Received: 01 December 2021
Published: 26 October 2022
Fund: National Natural Science Foundation of China (72174086); Special Forward-looking Development Strategy Research Project of the Fundamental Research Funds for the Central Universities (NW2020001)
Corresponding Author: Ma Jing, ORCID: 0000-0001-8472-2518, E-mail: majing5525@126.com