Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (4): 114-128     https://doi.org/10.11925/infotech.2096-3467.2022.0479
Research Article
A Deep Reinforcement Learning Recommendation Model with Multi-modal Features
Pan Huali1, Xie Jun1, Gao Jing1, Xu Xinying2, Wang Changzheng3
1College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600,China
2College of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan 030024,China
3Shanxi Tongfang Knowledge Network Digital Publishing Technology Co., Ltd., Taiyuan 030000, China

Abstract

[Objective] This paper addresses data sparsity and the dynamic changes in user interests with multimodal feature fusion and deep reinforcement learning. [Methods] First, we used a pre-trained model and an attention mechanism to achieve intra-modal representation and fusion across three modalities. Then, we modeled user-item interactions. Finally, we utilized a deep reinforcement learning algorithm to capture user interest drift and long- and short-term rewards in real time for personalized recommendation. [Results] Compared with the highest values among the baseline models, the proposed model improved Precision@5 by 11.8%, 16.5%, and 11.4%, and NDCG@5 by 5.3%, 8.0%, and 6.4% on the MovieLens-1M, MovieLens-100K, and Douban datasets, respectively. [Limitations] The user interaction history in the Douban dataset is relatively sparse, so the proposed model cannot learn accurate user preferences during training; compared with the experiments on the MovieLens datasets, the recommendation results are limited. [Conclusions] The proposed model integrates multimodal item information to reconstruct the state representation network of deep reinforcement learning, improving recommendation performance.
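The fusion step described above (pre-trained intra-modal encoders followed by attention over three modalities) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the projection matrix `W`, the scoring vector `q`, and the premise that all three modality vectors are already projected to one shared dimension are hypothetical.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def fuse_modalities(text_vec, image_vec, attr_vec, W, q):
    """Attention-weighted fusion of three modality vectors.

    Each input is assumed to be an intra-modal representation already
    projected to a common dimension d (e.g., PV-DM features for text,
    VGG-16 features for images). W (d x d) and q (d,) stand in for
    learned attention parameters; the names are hypothetical.
    """
    modalities = np.stack([text_vec, image_vec, attr_vec])  # (3, d)
    scores = np.tanh(modalities @ W) @ q                    # one score per modality
    weights = softmax(scores)                               # attention weights, sum to 1
    return weights @ modalities                             # (d,) fused item vector

rng = np.random.default_rng(0)
d = 8
fused = fuse_modalities(rng.normal(size=d), rng.normal(size=d),
                        rng.normal(size=d), rng.normal(size=(d, d)),
                        rng.normal(size=d))
print(fused.shape)  # (8,)
```

The fused item vector would then feed the state representation of the reinforcement-learning agent alongside the user embedding.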

Key words: Recommendation; Deep Reinforcement Learning; Multimodal Feature Fusion; User-Recommender System Interaction
Received: 2022-05-12      Published online: 2022-11-09
CLC Number: TP391
Fund: Open Fund of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (VRLAB2022C11); Shanxi Province Research Project for Returned Overseas Scholars (2020-040); Shanxi Province Special Project for Science and Technology Cooperation and Exchange (202104041101030)
Corresponding author: Xie Jun, ORCID: 0000-0003-0955-9970, E-mail: xiejun@tyut.edu.cn
Cite this article:
Pan Huali, Xie Jun, Gao Jing, Xu Xinying, Wang Changzheng. A Deep Reinforcement Learning Recommendation Model with Multi-modal Features. Data Analysis and Knowledge Discovery, 2023, 7(4): 114-128.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0479      或      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I4/114
Fig.1  Deep reinforcement learning recommendation model with multimodal features
Fig.2  Interaction between the user and the recommender system
Fig.3  Structure of the PV-DM model
Fig.4  Structure of the VGG-16 model
Fig.5  Multimodal movie representation and interactive fusion network
Fig.6  User-item interaction modeling
Item                  MovieLens-100K   MovieLens-1M   Douban
Users                 943              6,040          6,214
Multimodal movies     1,681            3,883          6,819
Ratings               99,991           1,000,209      286,165
Average interactions  106.0350         165.5975       46.0517
Table 1  Statistics of the datasets
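As a quick sanity check on Table 1 (this small verification is mine, not from the paper), the "average interactions" row equals the number of ratings divided by the number of users in each dataset:

```python
# Average interactions per user = ratings / users, per Table 1.
datasets = {
    "MovieLens-100K": (943, 99_991),
    "MovieLens-1M":   (6_040, 1_000_209),
    "Douban":         (6_214, 286_165),
}
for name, (users, ratings) in datasets.items():
    print(f"{name}: {ratings / users:.4f} interactions per user")
```

The computed values match the table (106.0350, 165.5975, 46.0517), confirming Douban's users interact far less on average, which the Limitations section identifies as the cause of weaker results there.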
Model       Year   Attributes   Ratings   Text   Images
BPR[3]      2009   ×            √         ×      ×
NAIS[5]     2018   ×            √         ×      ×
VBPR[17]    2016   ×            √         ×      √
ConvMF[18]  2016   ×            √         √      ×
DQN[26]     2015   √            √         √      √
DRR[15]     2020   ×            √         ×      ×
M2DR-RM     —      √            √         √      √
Table 2  Comparison models
Model      Precision@5                      Precision@10
           ML-1M    ML-100K   Douban        ML-1M    ML-100K   Douban
BPR        0.2426   0.2481    0.1846        0.2010   0.1962    0.1572
NAIS       0.2282   0.2199    0.2021        0.2053   0.1790    0.1506
VBPR       0.2820   0.2774    0.2162        0.1878   0.1874    0.1179
ConvMF     0.2669   0.2507    0.3125        0.2118   0.2067    0.3051
DQN        0.3391   0.3219    0.3055        0.3950   0.3128    0.3033
DRR        0.4217   0.3543    0.3054        0.4464   0.3266    0.2951
M2DR-RM    0.5395   0.5194    0.4190        0.5071   0.4947    0.3830
Table 3  Precision results of different models on the datasets
Model      NDCG@5                           NDCG@10
           ML-1M    ML-100K   Douban        ML-1M    ML-100K   Douban
BPR        0.2682   0.2943    0.2808        0.2558   0.2920    0.2342
NAIS       0.2513   0.2586    0.2542        0.2581   0.2579    0.2516
VBPR       0.4116   0.4119    0.3680        0.3203   0.3338    0.3011
ConvMF     0.4081   0.4052    0.4729        0.3184   0.3179    0.4694
DQN        0.6100   0.5473    0.4796        0.6821   0.6551    0.4915
DRR        0.6807   0.5976    0.4983        0.7001   0.6121    0.4828
M2DR-RM    0.7335   0.6779    0.5623        0.7022   0.6568    0.5440
Table 4  NDCG results of different models on the datasets
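The gains reported in the abstract can be reproduced from Tables 3 and 4: they correspond to absolute percentage-point differences between M2DR-RM and DRR (the strongest deep-reinforcement-learning baseline) at cutoff 5. This check is mine, not part of the paper:

```python
# Precision@5 and NDCG@5 values copied from Tables 3 and 4.
p5_drr  = {"ML-1M": 0.4217, "ML-100K": 0.3543, "Douban": 0.3054}
p5_ours = {"ML-1M": 0.5395, "ML-100K": 0.5194, "Douban": 0.4190}
ndcg5_drr  = {"ML-1M": 0.6807, "ML-100K": 0.5976, "Douban": 0.4983}
ndcg5_ours = {"ML-1M": 0.7335, "ML-100K": 0.6779, "Douban": 0.5623}

for ds in p5_drr:
    dp = (p5_ours[ds] - p5_drr[ds]) * 100      # percentage-point gain
    dn = (ndcg5_ours[ds] - ndcg5_drr[ds]) * 100
    print(f"{ds}: Precision@5 +{dp:.1f} pts, NDCG@5 +{dn:.1f} pts")
```

The printed gains (11.8, 16.5, 11.4 points for Precision@5 and 5.3, 8.0, 6.4 points for NDCG@5) match the figures quoted in the abstract.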
Fig.7  Ablation study results
Fig.8  Results for parameter T on the three datasets
Fig.9  Average recommendation time per user on the MovieLens-1M dataset
Fig.10  Comparison of recommendation strategies
[1] China Internet Network Information Center. The 49th Statistical Report on China's Internet Development[R/OL]. [2023-03-06]. https://www.cnnic.net.cn/NMediaFile/old_attach/P020220721404263787858.pdf.
[2] Linden G, Smith B, York J. Amazon.com Recommendations: Item-to-Item Collaborative Filtering[J]. IEEE Internet Computing, 2003, 7(1): 76-80.
doi: 10.1109/MIC.2003.1167344
[3] Rendle S, Freudenthaler C, Gantner Z, et al. BPR: Bayesian Personalized Ranking from Implicit Feedback[C]// Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence. 2009: 452-461.
[4] Cheng H T, Koc L, Harmsen J, et al. Wide&Deep Learning for Recommender Systems[C]// Proceedings of the 1st Workshop on Deep Learning for Recommender Systems. 2016: 7-10.
[5] He X N, He Z K, Song J K, et al. NAIS: Neural Attentive Item Similarity Model for Recommendation[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 30(12): 2354-2366.
doi: 10.1109/TKDE.2018.2831682
[6] Yu Li, Du Qihan, Yue Boyan, et al. Survey of Reinforcement Learning Based Recommender Systems[J]. Computer Science, 2021, 48(10): 1-18.
doi: 10.11896/jsjkx.210200085
[7] Maillard O A, Munos R, Ryabko D. Selecting the State-Representation in Reinforcement Learning[C]// Proceedings of the 24th International Conference on Neural Information Processing Systems. 2011: 2627-2635.
[8] Gao J, Li P, Chen Z K, et al. A Survey on Deep Learning for Multimodal Data Fusion[J]. Neural Computation, 2020, 32(5): 829-864.
doi: 10.1162/neco_a_01273 pmid: 32186998
[9] Zheng G J, Zhang F Z, Zheng Z H, et al. DRN: A Deep Reinforcement Learning Framework for News Recommendation[C]// Proceedings of the 2018 World Wide Web Conference. 2018: 167-176.
[10] Zhao X Y, Zhang L, Ding Z Y, et al. Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning[C]// Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018: 1040-1048.
[11] Zhang Y Y, Su X Y, Liu Y. A Novel Movie Recommendation System Based on Deep Reinforcement Learning with Prioritized Experience Replay[C]// Proceedings of 2019 IEEE 19th International Conference on Communication Technology. 2019: 1496-1500.
[12] Yan Shihong, Ma Weizhi, Zhang Min, et al. Reinforcement Learning with User Long-Term and Short-Term Preference for Personalized Recommendation[J]. Journal of Chinese Information Processing, 2021, 35(8): 107-116.
[13] Zhao X Y, Xia L, Zhang L, et al. Deep Reinforcement Learning for Page-Wise Recommendations[C]// Proceedings of the 12th ACM Conference on Recommender Systems. 2018: 95-103.
[14] Zhou Q L. A Novel Movies Recommendation Algorithm Based on Reinforcement Learning with DDPG Policy[J]. International Journal of Intelligent Computing and Cybernetics, 2020, 13(1): 67-79.
doi: 10.1108/IJICC-09-2019-0103
[15] Liu F, Tang R M, Li X T, et al. State Representation Modeling for Deep Reinforcement Learning Based Recommendation[J]. Knowledge-Based Systems, 2020, 205: 106170.
doi: 10.1016/j.knosys.2020.106170
[16] Xin X, Karatzoglou A, Arapakis I, et al. Supervised Advantage Actor-Critic for Recommender Systems[C]// Proceedings of the 15th ACM International Conference on Web Search and Data Mining. 2022: 1186-1196.
[17] He R N, McAuley J. VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback[C]// Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016: 144-150.
[18] Kim D, Park C, Oh J, et al. Convolutional Matrix Factorization for Document Context-Aware Recommendation[C]// Proceedings of the 10th ACM Conference on Recommender Systems. 2016: 233-240.
[19] Ma Yingxue, Gan Mingxin, Xiao Kejun. A Matrix Factorization Recommendation Method with Tags and Contents[J]. Data Analysis and Knowledge Discovery, 2021, 5(5): 71-82.
[20] Mangolin R B, Pereira R M, Britto A S Jr, et al. A Multimodal Approach for Multi-Label Movie Genre Classification[J]. Multimedia Tools and Applications, 2022, 81(14): 19071-19096.
doi: 10.1007/s11042-020-10086-2
[21] Wang J H, Wu Y T, Wang L. Predicting Implicit User Preferences with Multimodal Feature Fusion for Similar User Recommendation in Social Media[J]. Applied Sciences, 2021, 11(3): 1064.
doi: 10.3390/app11031064
[22] Wang Z, Chen H L, Li Z, et al. VRConvMF: Visual Recurrent Convolutional Matrix Factorization for Movie Recommendation[J]. IEEE Transactions on Emerging Topics in Computational Intelligence, 2022, 6(3): 519-529.
doi: 10.1109/TETCI.2021.3102619
[23] Han Tengyue, Niu Shaozhang, Zhang Wen. Multimodal Sequential Recommendation Algorithm Based on Contrastive Learning[J]. Journal of Computer Applications, 2022, 42(6): 1683-1688.
doi: 10.11772/j.issn.1001-9081.2021081417
[24] Mikolov T, Sutskever I, Chen K, et al. Distributed Representations of Words and Phrases and Their Compositionality[C]// Proceedings of the 26th International Conference on Neural Information Processing Systems - Volume 2. 2013: 3111-3119.
[25] Lillicrap T P, Hunt J J, Pritzel A, et al. Continuous Control with Deep Reinforcement Learning[OL]. arXiv Preprint, arXiv:1509.02971.
[26] Mnih V, Kavukcuoglu K, Silver D, et al. Human-Level Control Through Deep Reinforcement Learning[J]. Nature, 2015, 518(7540): 529-533.
doi: 10.1038/nature14236
Related articles in this journal:
[1] Chen Wenjie. Recommending Scientific Research Collaborators with Hypergraphs[J]. Data Analysis and Knowledge Discovery, 2023, 7(4): 68-76.
[2] Tian Tianjunzi, Zhu Xuefang. Session-Based Recommendation Algorithm for Repeated Consumption Scenarios[J]. Data Analysis and Knowledge Discovery, 2023, 7(4): 89-100.
[3] Li Haojun, Lü Yun, Wang Xuhui, Huang Jiya. A Multi-layer Interactive Deep Recommendation Model Incorporating Sentiment Analysis[J]. Data Analysis and Knowledge Discovery, 2023, 7(3): 43-57.
[4] Cheng Quan, She Dexin. A Graph Neural Network Drug Recommendation Method Integrating Patient Signs and Medication Data[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 113-124.
[5] Wang Dailin, Liu Lina, Liu Meiling, Liu Yaqiu. Analyzing Reader Preferences and Recommendation Models with an Attention Mechanism over Book Catalogs[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 138-152.
[6] Tang Jiao, Zhang Lisheng, Sang Chunyan. A News Recommendation Model Based on Latent Topic Distributions and Long- and Short-Term User Representations[J]. Data Analysis and Knowledge Discovery, 2022, 6(9): 52-64.
[7] Ding Hao, Hu Guangwei, Qi Jianglei, Zhuang Guangguang. A Medical Literature Recommendation Method Based on Random Forests and Keyword Query Expansion[J]. Data Analysis and Knowledge Discovery, 2022, 6(7): 32-43.
[8] Zhang Ruoqi, Shen Jianfang, Chen Pinghua. Session-Based Sequential Recommendation Combining GNN, Bi-GRU, and Attention Mechanisms[J]. Data Analysis and Knowledge Discovery, 2022, 6(6): 46-54.
[9] Guo Lei, Liu Wenju, Wang Ze, Ren Yueqiang. A Point-of-Interest Recommendation Method Integrating Spectral Clustering and Multi-factor Influences[J]. Data Analysis and Knowledge Discovery, 2022, 6(5): 77-88.
[10] Zheng Xiao, Li Shuqing, Zhang Zhiwang. Measuring User-Item Quality from Rating Analysis and Its Application in Deep Recommendation Models[J]. Data Analysis and Knowledge Discovery, 2022, 6(4): 39-48.
[11] Ding Hao, Hu Guangwei, Wang Ting, Suo Wei. A Latent Factor Model Recommendation Method Based on Temporal Drift[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 1-8.
[12] Li Zhi, Sun Rui, Yao Yuxuan, Li Xiaohuan. A Point-of-Interest Recommender System Based on Real-Time Event Detection[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 114-127.
[13] Dong Wenhui, Xiong Huixiang, Du Jin, Wang Niuniu. Recommending Research Collaborators Based on Scholar Profiles[J]. Data Analysis and Knowledge Discovery, 2022, 6(10): 20-34.
[14] Wang Qinjie, Qin Chunxiu, Ma Xubu, Liu Huailiang, Xu Cunzhen. Recommending Scientific Literature Based on Author Preferences and Heterogeneous Information Networks[J]. Data Analysis and Knowledge Discovery, 2021, 5(8): 54-64.
[15] Ruan Xiaoyun, Liao Jianbin, Li Xiang, Yang Yang, Li Daifeng. Interpretable Recommendation with Reinforcement Learning over a Talent Knowledge Graph[J]. Data Analysis and Knowledge Discovery, 2021, 5(6): 36-50.
Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing 100190, China
Tel/Fax: (010) 82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn