Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (4): 114-128    DOI: 10.11925/infotech.2096-3467.2022.0479
A Deep Reinforcement Learning Recommendation Model with Multi-modal Features
Pan Huali1, Xie Jun1, Gao Jing1, Xu Xinying2, Wang Changzheng3
1 College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China
2 College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024, China
3 Shanxi Tongfang Knowledge Network Digital Publishing Technology Co., Ltd., Taiyuan 030000, China

[Objective] This paper addresses data sparsity and the dynamic drift of user interests in recommendation by combining multimodal feature fusion with deep reinforcement learning. [Methods] First, we used a pre-trained model and an attention mechanism to obtain intra-modal representations of three modalities and fuse them. Then, we modeled user-item interactions. Finally, we used a deep reinforcement learning algorithm to capture user interest drift and both long-term and short-term rewards in real time, yielding personalized recommendations. [Results] Compared with the best baseline, the proposed model improved Precision@5 by 11.8, 16.5, and 11.4 percentage points and NDCG@5 by 5.3, 8.0, and 6.4 percentage points on the MovieLens-1M, MovieLens-100K, and Douban datasets, respectively. [Limitations] Users in the Douban dataset have relatively short interaction histories, so the model learns their preferences less accurately during training, and its recommendation performance on Douban is lower than on the MovieLens datasets. [Conclusions] The proposed model integrates multimodal information to reconstruct the state representation network of deep reinforcement learning, which improves recommendation quality.
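The attention-based fusion step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: the modality names (`text_vec`, `image_vec`, `attr_vec`), the `query` vector, and the use of plain dot-product attention are all assumptions made for illustration, and a real model would learn these vectors rather than hard-code them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(modal_vectors, query):
    """Fuse equal-length modality vectors with dot-product attention.

    Each modality is scored against the query, the scores are turned
    into weights with softmax, and the fused vector is the weighted sum.
    """
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in modal_vectors]
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, modal_vectors))
        for i in range(len(query))
    ]
    return fused, weights


# Hypothetical 2-dimensional embeddings for one movie (illustrative values).
text_vec = [1.0, 0.0]
image_vec = [0.0, 1.0]
attr_vec = [1.0, 1.0]
query = [1.0, 0.0]  # assumed query vector; learned in a real model

fused, weights = attention_fuse([text_vec, image_vec, attr_vec], query)
print(weights)
print(fused)
```

With this query, the text and attribute vectors receive equal weight and the image vector receives less, because its dot product with the query is smallest.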

Key words: Recommendation; Deep Reinforcement Learning; Multimodal Feature Fusion; User-Recommender System Interaction
Received: 12 May 2022      Published: 09 November 2022
CLC Number: TP391
Fund: Open Project Program of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (VRLAB2022C11); Research Project of Shanxi Scholarship Council of China (2020-040); Shanxi Province Science and Technology Cooperation Exchange Special Project (202104041101030)
Corresponding Author: Xie Jun, ORCID: 0000-0003-0955-9970

Cite this article:

Pan Huali, Xie Jun, Gao Jing, Xu Xinying, Wang Changzheng. A Deep Reinforcement Learning Recommendation Model with Multi-modal Features. Data Analysis and Knowledge Discovery, 2023, 7(4): 114-128.


Deep Reinforcement Learning Recommendation Model Fused with Multi-modal Features (M2DR-RM)
The Interaction Process Between Users and the Recommender System
Model Structure of PV-DM
Model Structure of VGG-16
Multimodal Representation and Interactive Fusion Network for Movies
User-Item Interaction Modeling
Item                             MovieLens-100K   MovieLens-1M   Douban
Number of users                  943              6,040          6,214
Number of multimodal movies      1,681            3,883          6,819
Number of ratings                99,991           1,000,209      286,165
Average interactions per user    106.0350         165.5975       46.0517
Statistics for the Dataset
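The average-interaction row is consistent with dividing the number of ratings by the number of users. A short check, using only the counts reported in the table (the dictionary layout and function name are illustrative choices):

```python
# Counts copied from the dataset statistics table.
datasets = {
    "MovieLens-100K": {"users": 943, "ratings": 99_991},
    "MovieLens-1M": {"users": 6_040, "ratings": 1_000_209},
    "Douban": {"users": 6_214, "ratings": 286_165},
}


def average_interactions(users, ratings):
    """Mean number of rated items per user."""
    return ratings / users


averages = {
    name: round(average_interactions(**stats), 4)
    for name, stats in datasets.items()
}

for name, value in averages.items():
    print(f"{name}: {value:.4f}")
```

Douban has roughly 46 interactions per user against more than 100 on both MovieLens datasets, which is the sparsity the Limitations statement refers to.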
Model  Year  Attribute information  Rating values  Text information  Image information
BPR[3] 2009 × × ×
NAIS[5] 2018 × × ×
VBPR[17] 2016 × ×
ConvMF[18] 2016 × ×
DQN[26] 2015
DRR[15] 2020 × × ×
Compared Models
Model      Precision@5                    Precision@10
           ML-1M    ML-100K   Douban      ML-1M    ML-100K   Douban
BPR        0.2426   0.2481    0.1846      0.2010   0.1962    0.1572
NAIS       0.2282   0.2199    0.2021      0.2053   0.1790    0.1506
VBPR       0.2820   0.2774    0.2162      0.1878   0.1874    0.1179
ConvMF     0.2669   0.2507    0.3125      0.2118   0.2067    0.3051
DQN        0.3391   0.3219    0.3055      0.3950   0.3128    0.3033
DRR        0.4217   0.3543    0.3054      0.4464   0.3266    0.2951
M2DR-RM    0.5395   0.5194    0.4190      0.5071   0.4947    0.3830
Accuracy Results of Different Models on Different Datasets
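For readers unfamiliar with the metric, Precision@k is commonly defined as the fraction of the top-k recommended items that the user actually interacted with. The sketch below uses that common definition, which may differ in detail from the evaluation protocol of the paper. The item IDs are made up; the two Precision@5 values at the end are copied from the MovieLens-1M column of the table.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommended items that are relevant."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


# Hypothetical ranked list and ground-truth set for one user.
recommended = ["m1", "m2", "m3", "m4", "m5"]
relevant = {"m2", "m5", "m9"}
p_at_5 = precision_at_k(recommended, relevant, 5)
print(p_at_5)

# Absolute gain of M2DR-RM over DRR on MovieLens-1M, in percentage points.
gain_points = round((0.5395 - 0.4217) * 100, 1)
print(gain_points)
```

The gain computed this way matches the 11.8 figure quoted in the abstract, which indicates that the reported improvements are absolute differences rather than relative ones.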
Model      NDCG@5                         NDCG@10
           ML-1M    ML-100K   Douban      ML-1M    ML-100K   Douban
BPR        0.2682   0.2943    0.2808      0.2558   0.2920    0.2342
NAIS       0.2513   0.2586    0.2542      0.2581   0.2579    0.2516
VBPR       0.4116   0.4119    0.3680      0.3203   0.3338    0.3011
ConvMF     0.4081   0.4052    0.4729      0.3184   0.3179    0.4694
DQN        0.6100   0.5473    0.4796      0.6821   0.6551    0.4915
DRR        0.6807   0.5976    0.4983      0.7001   0.6121    0.4828
M2DR-RM    0.7335   0.6779    0.5623      0.7022   0.6568    0.5440
NDCG Results of Different Models on Different Datasets
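NDCG@k rewards placing relevant items near the top of the list. A minimal sketch of the standard formulation follows, assuming a log2 position discount and an ideal ordering obtained by sorting the same relevance list; the paper may use a variant, and the relevance values are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k positions (rank starts at 1)."""
    return sum(
        rel / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical binary relevance of five recommended movies, in ranked order.
ranked_relevance = [1, 0, 1, 0, 0]
score = ndcg_at_k(ranked_relevance, 5)
print(round(score, 4))
```

Here the second relevant item sits at rank 3 instead of rank 2, so the score falls below 1.0; a list with all relevant items first would score exactly 1.0.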
Ablation Experiment Results
The Results of Parameter T on Three Datasets
Average Recommendation Time per User on MovieLens-1M Dataset
Comparison Results of Recommended Strategies
[1] China Internet Network Information Center. Statistical Report of the 49th Chinese Internet Development[R/OL]. [2023-03-06]. (in Chinese)
[2] Linden G, Smith B, York J. Recommendations: Item-to-Item Collaborative Filtering[J]. IEEE Internet Computing, 2003, 7(1): 76-80.
doi: 10.1109/MIC.2003.1167344
[3] Rendle S, Freudenthaler C, Gantner Z, et al. BPR: Bayesian Personalized Ranking from Implicit Feedback[C]// Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. 2009: 452-461.
[4] Cheng H T, Koc L, Harmsen J, et al. Wide & Deep Learning for Recommender Systems[C]// Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. 2016: 7-10.
[5] He X N, He Z K, Song J K, et al. NAIS: Neural Attentive Item Similarity Model for Recommendation[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(12): 2354-2366.
doi: 10.1109/TKDE.2018.2831682
[6] Yu Li, Du Qihan, Yue Boyan, et al. Survey of Reinforcement Learning Based Recommender Systems[J]. Computer Science, 2021, 48(10): 1-18. (in Chinese)
doi: 10.11896/jsjkx.210200085
[7] Maillard O A, Munos R, Ryabko D. Selecting the State-Representation in Reinforcement Learning[C]// Proceedings of the 24th International Conference on Neural Information Processing Systems. 2011: 2627-2635.
[8] Gao J, Li P, Chen Z K, et al. A Survey on Deep Learning for Multimodal Data Fusion[J]. Neural Computation, 2020, 32(5): 829-864.
doi: 10.1162/neco_a_01273 pmid: 32186998
[9] Zheng G J, Zhang F Z, Zheng Z H, et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation[C]// Proceedings of the 2018 World Wide Web Conference. 2018: 167-176.
[10] Zhao X Y, Zhang L, Ding Z Y, et al. Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018: 1040-1048.
[11] Zhang Y Y, Su X Y, Liu Y. A Novel Movie Recommendation System Based on Deep Reinforcement Learning with Prioritized Experience Replay[C]// Proceedings of 2019 IEEE 19th International Conference on Communication Technology. 2019: 1496-1500.
[12] Yan Shihong, Ma Weizhi, Zhang Min, et al. Reinforcement Learning with User Long-Term and Short-Term Preference for Personalized Recommendation[J]. Journal of Chinese Information Processing, 2021, 35(8): 107-116. (in Chinese)
[13] Zhao X Y, Xia L, Zhang L, et al. Deep Reinforcement Learning for Page-Wise Recommendations[C]// Proceedings of the 12th ACM Conference on Recommender Systems. 2018: 95-103.
[14] Zhou Q L. A Novel Movies Recommendation Algorithm Based on Reinforcement Learning with DDPG Policy[J]. International Journal of Intelligent Computing and Cybernetics, 2020, 13(1): 67-79.
doi: 10.1108/IJICC-09-2019-0103
[15] Liu F, Tang R M, Li X T, et al. State Representation Modeling for Deep Reinforcement Learning Based Recommendation[J]. Knowledge-Based Systems, 2020, 205: 106170.
doi: 10.1016/j.knosys.2020.106170
[16] Xin X, Karatzoglou A, Arapakis I, et al. Supervised Advantage Actor-Critic for Recommender Systems[C]// Proceedings of the 15th ACM International Conference on Web Search and Data Mining. 2022: 1186-1196.
[17] He R N, McAuley J. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback[C]// Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016: 144-150.
[18] Kim D, Park C, Oh J, et al. Convolutional Matrix Factorization for Document Context-Aware Recommendation[C]// Proceedings of the 10th ACM Conference on Recommender Systems. 2016: 233-240.
[19] Ma Yingxue, Gan Mingxin, Xiao Kejun. A Matrix Factorization Recommendation Method with Tags and Contents[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 71-82. (in Chinese)
[20] Mangolin R B, Pereira R M, Britto A S Jr, et al. A Multimodal Approach for Multi-Label Movie Genre Classification[J]. Multimedia Tools and Applications, 2022, 81(14): 19071-19096.
doi: 10.1007/s11042-020-10086-2
[21] Wang J H, Wu Y T, Wang L. Predicting Implicit User Preferences with Multimodal Feature Fusion for Similar User Recommendation in Social Media[J]. Applied Sciences, 2021, 11(3): 1064.
doi: 10.3390/app11031064
[22] Wang Z, Chen H L, Li Z, et al. VRConvMF: Visual Recurrent Convolutional Matrix Factorization for Movie Recommendation[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2022, 6(3): 519-529.
doi: 10.1109/TETCI.2021.3102619
[23] Han Tengyue, Niu Shaozhang, Zhang Wen. Multimodal Sequential Recommendation Algorithm Based on Contrastive Learning[J]. Journal of Computer Applications, 2022, 42(6): 1683-1688. (in Chinese)
doi: 10.11772/j.issn.1001-9081.2021081417
[24] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. 2013: 3111-3119.
[25] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous Control with Deep Reinforcement Learning[OL]. arXiv Preprint, arXiv:1509.02971.
[26] Mnih V, Kavukcuoglu K, Silver D, et al. Human-Level Control Through Deep Reinforcement Learning[J]. Nature, 2015, 518(7540): 529-533.
doi: 10.1038/nature14236
[1] Chen Wenjie. Scientific Collaboration Recommendation Based on Hypergraph[J]. Data Analysis and Knowledge Discovery, 2023, 7(4): 68-76.
[2] Tian Tianjunzi, Zhu Xuefang. Session-Based Recommendation Algorithm for Repeat Consumption Scenarios[J]. Data Analysis and Knowledge Discovery, 2023, 7(4): 89-100.
[3] Li Haojun, Lv Yun, Wang Xuhui, Huang Jieya. A Deep Recommendation Model with Multi-Layer Interaction and Sentiment Analysis[J]. Data Analysis and Knowledge Discovery, 2023, 7(3): 43-57.
[4] Cheng Quan, She Dexin. Drug Recommendation Based on Graph Neural Network with Patient Signs and Medication Data[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 113-124.
[5] Wang Dailin, Liu Lina, Liu Meiling, Liu Yaqiu. Reader Preference Analysis and Book Recommendation Model with Attention Mechanism of Catalogs[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 138-152.
[6] Tang Jiao, Zhang Lisheng, Sang Chunyan. News Recommendation with Latent Topic Distribution and Long and Short-Term User Representations[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 52-64.
[7] Ding Hao, Hu Guangwei, Qi Jianglei, Zhuang Guangguang. Recommending Medical Literature with Random Forest Model and Query Expansion[J]. Data Analysis and Knowledge Discovery, 2022, 6(7): 32-43.
[8] Zhang Ruoqi, Shen Jianfang, Chen Pinghua. Session Sequence Recommendation with GNN, Bi-GRU and Attention Mechanism[J]. Data Analysis and Knowledge Discovery, 2022, 6(6): 46-54.
[9] Guo Lei, Liu Wenju, Wang Ze, Ren Yueqiang. Point-of-Interest Recommendation with Spectral Clustering and Multi-Factors[J]. Data Analysis and Knowledge Discovery, 2022, 6(5): 77-88.
[10] Zheng Xiao, Li Shuqing, Zhang Zhiwang. Measuring User Item Quality with Rating Analysis for Deep Recommendation Model[J]. Data Analysis and Knowledge Discovery, 2022, 6(4): 39-48.
[11] Ding Hao, Hu Guangwei, Wang Ting, Suo Wei. Recommendation Method for Potential Factor Model Based on Time Series Drift[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 1-8.
[12] Li Zhi, Sun Rui, Yao Yuxuan, Li Xiaohuan. Recommending Point-of-Interests with Real-Time Event Detection[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 114-127.
[13] Dong Wenhui, Xiong Huixiang, Du Jin, Wang Niuniu. Recommending Research Collaborators Based on Scholar Profiling[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 20-34.
[14] Wang Qinjie, Qin Chunxiu, Ma Xubu, Liu Huailiang, Xu Cunzhen. Recommending Scientific Literature Based on Author Preference and Heterogeneous Information Network[J]. Data Analysis and Knowledge Discovery, 2021, 5(8): 54-64.
[15] Ruan Xiaoyun, Liao Jianbin, Li Xiang, Yang Yang, Li Daifeng. Interpretable Recommendation of Reinforcement Learning Based on Talent Knowledge Graph Reasoning[J]. Data Analysis and Knowledge Discovery, 2021, 5(6): 36-50.