Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (6): 103-114    DOI: 10.11925/infotech.2096-3467.2020.1159
Sentiment Classification of Image-Text Information with Multi-Layer Semantic Fusion
Xie Hao, Mao Jin, Li Gang
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China

[Objective] This paper analyzes the sentiment of image-text posts on social media to better understand public emotions and opinion tendencies. [Methods] To fully exploit the correlation and complementarity between images and text, we propose an image-text sentiment classification model for social media based on multi-layer semantic fusion. The model comprises three sub-models: a text-image semantic association model, an image-text semantic association model, and a multimodal semantic deep association fusion model. Together they capture bidirectional, multi-level semantic associations between images and text. The final classification is obtained by applying a weighting strategy to the sentiment classification scores produced by the three sub-models. [Results] In experiments on real image-text data, the model achieved the best performance on all evaluation metrics; its accuracy and F1 were 1.0% and 1.2% higher, respectively, than those of the best baseline model. [Limitations] The model was evaluated on only one dataset; more research is needed to examine its robustness and scalability. [Conclusions] In the sentiment classification task, the proposed model explores the correlation and complementarity between image and text information on social media more effectively.
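The weighting strategy in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes a convex combination in which two weights, alpha and beta, apply to the first two sub-models and the third receives the remainder. The function names, the 0.5 threshold, and the sample scores are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def fuse_scores(
    p_text_image: Sequence[float],
    p_image_text: Sequence[float],
    p_fusion: Sequence[float],
    alpha: float,
    beta: float,
) -> List[float]:
    """Weighted average of positive-class scores from three sub-models.

    Assumption: the weights are alpha, beta, and 1 - alpha - beta.
    The paper's actual weighting scheme may differ.
    """
    gamma = 1.0 - alpha - beta
    if alpha < 0 or beta < 0 or gamma < -1e-9:
        raise ValueError("alpha and beta must be non-negative and sum to at most 1")
    gamma = max(gamma, 0.0)  # absorb floating-point rounding noise
    if not (len(p_text_image) == len(p_image_text) == len(p_fusion)):
        raise ValueError("all score lists must have the same length")
    return [
        alpha * a + beta * b + gamma * c
        for a, b, c in zip(p_text_image, p_image_text, p_fusion)
    ]


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map fused scores to labels: 1 (positive) or 0 (negative)."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical scores for two posts from the three sub-models.
fused = fuse_scores([0.9, 0.2], [0.6, 0.4], [0.3, 0.1], alpha=0.5, beta=0.25)
labels = classify(fused)
```

In practice alpha and beta would be chosen on held-out data, for example by sweeping a grid of values and keeping the pair with the best accuracy or F1.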

Keywords: Image-Text Fusion; Attention Mechanism; Multi-Modality; Sentiment Classification; Social Media
Received: 24 November 2020      Published: 10 March 2021
ZTFLH:  G350  
Fund: National Natural Science Foundation of China (71790612); National Natural Science Foundation of China (71921002)
Corresponding Author: Mao Jin

Cite this article:

Xie Hao, Mao Jin, Li Gang. Sentiment Classification of Image-Text Information with Multi-Layer Semantic Fusion. Data Analysis and Knowledge Discovery, 2021, 5(6): 103-114.


Examples of Image-Text Data
The Framework of Image-Text Sentiment Classification Based on Multi-layer Semantic Fusion
The Framework of Text-Image Semantic Association Model
The Framework of Image-Text Semantic Association Model
The Framework of Multimodal Semantic Deep Association Fusion Model
Modality   Parameter name        Parameter value
Text       GloVe dimension       100
                                 1024
Image      Image size            224×224
           ResNet output layer   conv4_block6_out
Other      Dropout
Setting of Important Parameters
Method             Algorithm      Accuracy   Recall   Precision   F1
Baseline           STM            0.768      0.780    0.749       0.764
                   SIM            0.810      0.772    0.823       0.797
                   Early Fusion
                   Late Fusion
Proposed methods   TISAM          0.812      0.763    0.833       0.797
                   ITSAM          0.832      0.755    0.880       0.813
Algorithm Performance
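As a quick consistency check on the table, F1 can be recomputed from the reported Precision and Recall using the standard harmonic-mean definition, F1 = 2PR / (P + R). The sketch below assumes the paper uses that definition. Because the inputs are already rounded to three decimals, a difference of up to about 0.001 from the reported F1 is expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the rows that have values in the table.
reported = {
    "STM": (0.749, 0.780, 0.764),
    "SIM": (0.823, 0.772, 0.797),
    "TISAM": (0.833, 0.763, 0.797),
    "ITSAM": (0.880, 0.755, 0.813),
}

for name, (p, r, f1) in reported.items():
    recomputed = f1_score(p, r)
    print(f"{name}: recomputed F1 = {recomputed:.4f}, reported F1 = {f1:.3f}")
```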
Accuracy at Different Values of α and β
F1 Value at Different Values of α and β