Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (5): 91-101
Multimodal Sentiment Analysis Model Integrating Multi-features and Attention Mechanism
Lyu Xueqiang 1, Tian Chi 1, Zhang Le 1, Du Yifan 1, Zhang Xu 1, Cai Zangtai 2
1 Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
2 The State Key Laboratory of Tibetan Intelligent Information Processing and Application, Qinghai Normal University, Xining 810008, China
[Objective] Existing multimodal sentiment analysis models extract multimodal features insufficiently and combine intra-modal information with inter-modal interaction information inadequately. To address both problems, this paper proposes a multimodal sentiment analysis model that integrates multiple features and attention mechanisms (labelled MFAM in the figures and tables). [Methods] For multimodal feature extraction, the model adds the body movements, gender, and age of the people in the video to the video modality. For the text modality, it fuses BERT-based character-level semantic vectors with word-level semantic vectors that incorporate sememe information. Together, these additions enrich the low-level features of the multimodal data. Self-attention and cross-modal attention mechanisms then combine intra-modal and inter-modal information. Finally, the features of all modalities are concatenated, a soft attention mechanism assigns an attention weight to each modality's features, and a fully connected layer outputs the sentiment classification result. [Results] The model was compared with Self-MM on the public CH-SIMS dataset and on HPOC, a dataset of hot public opinion comment videos constructed in this paper. On CH-SIMS, the binary classification accuracy, three-class classification accuracy, and F1 score improve by 1.83, 1.74, and 0.69 percentage points, respectively; on HPOC, they improve by 1.03, 0.94, and 0.79 percentage points. [Limitations] The scene around the people in a video may change continually, and different scenes may carry different sentiment information; the model does not incorporate this scene information. [Conclusions] By enriching the low-level features of multimodal data and combining intra-modal and inter-modal information, the model improves the performance of sentiment analysis.

Key words: Multi-features    Multi-modal    Sentiment Analysis    Attention Mechanism
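The final fusion step described in the abstract (concatenate modality features, weight them with soft attention, classify with a fully connected layer) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the dot-product scoring with `score_vec`, the modality names, the function names, and all numbers are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention_fuse(features, score_vec):
    """Weight each modality's feature vector, then concatenate.

    features:  dict of modality name -> feature vector (equal lengths)
    score_vec: hypothetical learned vector; a modality's score is its
               dot product with this vector (an assumed scoring rule)
    """
    names = list(features)
    scores = [sum(f * w for f, w in zip(features[n], score_vec)) for n in names]
    weights = softmax(scores)
    fused = []
    for name, weight in zip(names, weights):
        fused.extend(weight * x for x in features[name])
    return dict(zip(names, weights)), fused

def linear(x, weight_rows, bias):
    """Fully connected layer: one output logit per row of weight_rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight_rows, bias)]

# Toy inputs (made-up numbers, not taken from the paper).
features = {
    "text": [0.9, 0.1],
    "audio": [0.2, 0.4],
    "video": [0.5, 0.5],
}
weights, fused = soft_attention_fuse(features, score_vec=[1.0, -1.0])
logits = linear(
    fused,
    weight_rows=[[0.1] * 6, [0.0] * 6, [-0.1] * 6],  # three sentiment classes
    bias=[0.0, 0.0, 0.0],
)
predicted_class = max(range(len(logits)), key=logits.__getitem__)
```

The attention weights sum to one, so the fused vector keeps the concatenated layout while letting the classifier rely more on the modalities that score highest.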
Received: 2023-01-11      Published: 2024-05-27
CLC number:  TP391
Corresponding author: Zhang Le, ORCID: 0000-0002-9620-511X
Cite this article: Lyu Xueqiang, Tian Chi, Zhang Le, Du Yifan, Zhang Xu, Cai Zangtai. Multimodal Sentiment Analysis Model Integrating Multi-features and Attention Mechanism[J]. Data Analysis and Knowledge Discovery, 2024, 8(5): 91-101.
Fig.1  Architecture of MFAM
Fig.2  Structure of the cross-modal attention module
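Since the figure itself is not reproduced here, the following minimal sketch shows the generic idea behind cross-modal attention: queries come from one modality, while keys and values come from another. It is a single-head scaled dot-product version with the learned projections omitted, so it may differ from the module in Fig. 2; the function name and the toy sequences are assumptions.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_modal_attention(target_seq, source_seq):
    """Single-head scaled dot-product attention across two modalities.

    Queries are the target-modality steps; keys and values are the
    source-modality steps. Learned projection matrices are left out
    (identity) to keep the sketch short.
    Returns one attended vector per target step, plus the attention map.
    """
    dim = len(target_seq[0])
    scale = math.sqrt(dim)
    attended, attn_map = [], []
    for query in target_seq:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in source_seq]
        weights = softmax(scores)
        attn_map.append(weights)
        attended.append([
            sum(w * value[d] for w, value in zip(weights, source_seq))
            for d in range(dim)
        ])
    return attended, attn_map

# Toy sequences (made-up numbers): 2 text steps attend over 3 video steps.
text_seq = [[1.0, 0.0], [0.0, 1.0]]
video_seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn_map = cross_modal_attention(text_seq, video_seq)
```

The output has one vector per text step, each a weighted mix of video steps, which is how information from one modality can be injected into another even when the two sequences have different lengths.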
              CH-SIMS                            HPOC
              Training  Validation  Test         Training  Validation  Test
Utterances    1,368     456         457          350       119         119
Positive      419       139         140          125       33          47
Neutral       207       69          69           87        32          37
Negative      742       248         248          138       54          35
Table 1  Basic information of the datasets
Environment         Configuration
Operating system    Linux
CPU                 Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
GPU                 Tesla V100
Python              3.8.13
PyTorch             1.12.1
CUDA                11.4
Table 2  Experimental environment
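Parameter                          Value    Parameter        Value
Cross-modal attention dimension    50       Learning_rate    0.002
Cross-modal attention heads        10       Dropout          0.3
Optimizer                          Adam     Early_stop       8
Number of iterations               20       Batch_size       16
Table 3  Experimental parameter settings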
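The settings in Table 3 could be collected in a small configuration object, as in the sketch below. The field names are hypothetical, and two readings are assumptions: that the number of iterations counts training epochs, and that Early_stop is a patience value (epochs without validation improvement before stopping). The `stopping_epoch` helper is likewise illustrative only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    # Values copied from Table 3; field names are hypothetical.
    cross_attn_dim: int = 50
    cross_attn_heads: int = 10
    optimizer: str = "Adam"
    num_iterations: int = 20    # assumed to count epochs
    learning_rate: float = 0.002
    dropout: float = 0.3
    early_stop: int = 8         # assumed: patience in epochs
    batch_size: int = 16

def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs fail to improve
    on the best validation loss; otherwise it runs through every
    supplied epoch.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)

cfg = TrainConfig()
# Per-head width if the attention dimension is split evenly across heads.
head_dim = cfg.cross_attn_dim // cfg.cross_attn_heads

# Made-up validation curve: best loss at epoch 2, no improvement afterwards.
example_losses = [1.0, 0.9] + [0.95] * 18
stop_at = stopping_epoch(example_losses, cfg.early_stop)
```

With 10 heads over a dimension of 50, each head would work with 5 dimensions under an even split, which is why the dimension must be divisible by the head count.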
Model      Acc-2/%   Acc-3/%   F1-Score/%   MAE
EF-LSTM    69.37     54.27     56.82        0.590
TFN        78.38     65.12     78.62        0.432
MFN        77.90     65.73     77.88        0.435
MulT       78.56     64.77     79.66        0.453
MISA       79.43     64.55     79.70        0.428
Self-MM    80.04     65.47     80.44        0.425
MFAM       81.87     67.21     81.13        0.416
Table 4  Experimental results of different models on the CH-SIMS dataset
Model      Acc-2/%   Acc-3/%   F1-Score/%   MAE
EF-LSTM    63.26     49.50     50.17        0.632
TFN        72.45     57.14     71.85        0.593
MFN        73.38     57.65     72.04        0.589
MulT       73.32     57.43     71.90        0.591
MISA       74.03     58.21     73.26        0.578
Self-MM    74.37     58.52     73.73        0.564
MFAM       75.40     59.46     74.52        0.560
Table 5  Experimental results of different models on the HPOC dataset
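The gains over Self-MM quoted in the abstract can be checked directly against the Self-MM and MFAM rows of Tables 4 and 5. The short script below does that arithmetic; the variable and function names are illustrative.

```python
# (Acc-2, Acc-3, F1-Score) in percent, copied from Tables 4 and 5.
RESULTS = {
    "CH-SIMS": {"Self-MM": (80.04, 65.47, 80.44), "MFAM": (81.87, 67.21, 81.13)},
    "HPOC": {"Self-MM": (74.37, 58.52, 73.73), "MFAM": (75.40, 59.46, 74.52)},
}

def gains(dataset):
    """Absolute gains of MFAM over Self-MM, in percentage points."""
    base = RESULTS[dataset]["Self-MM"]
    ours = RESULTS[dataset]["MFAM"]
    return tuple(round(o - b, 2) for o, b in zip(ours, base))

for name in RESULTS:
    print(name, gains(name))
```

The differences are absolute gaps between two percentages, so they are percentage points rather than relative percentage increases.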
No.   Model                  Acc-2/%   Acc-3/%   F1-Score/%   MAE
1     L-A                    80.41     66.47     79.71        0.435
2     L-V                    79.36     65.72     78.84        0.447
3     w/o Pose               80.92     66.64     79.93        0.435
4     w/o Gender             81.13     66.73     80.23        0.426
5     w/o Age                81.27     66.85     80.45        0.423
6     w/o P_G_A              80.35     66.39     79.48        0.444
7     w/o Sememe             81.05     66.70     79.95        0.432
8     w/o Cross-attention    80.18     65.96     79.26        0.446
9     w/o Soft-attention     81.35     67.03     80.56        0.421
10    MFAM                   81.87     67.21     81.13        0.416
Table 6  Ablation results on the CH-SIMS dataset
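One way to read Table 6 is to rank each ablated variant by how much binary accuracy it loses relative to the full MFAM model. The sketch below does this for Acc-2 only; the row labels are kept exactly as printed, and the variable names are illustrative.

```python
# Acc-2 (%) on CH-SIMS, copied from Table 6.
ACC2 = {
    "L-A": 80.41,
    "L-V": 79.36,
    "w/o Pose": 80.92,
    "w/o Gender": 81.13,
    "w/o Age": 81.27,
    "w/o P_G_A": 80.35,
    "w/o Sememe": 81.05,
    "w/o Cross-attention": 80.18,
    "w/o Soft-attention": 81.35,
}
FULL_MODEL_ACC2 = 81.87  # MFAM row

# Drop in percentage points relative to the full model.
drops = {name: round(FULL_MODEL_ACC2 - acc, 2) for name, acc in ACC2.items()}
ranked = sorted(drops.items(), key=lambda item: item[1], reverse=True)

for name, drop in ranked:
    print(f"{name:<20} -{drop:.2f}")
```

Among the "w/o" rows, removing cross-attention costs the most Acc-2 (1.69 points), followed by removing pose, gender, and age together (1.52 points), while removing soft attention costs the least (0.52 points).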
