Data Analysis and Knowledge Discovery, 2022, Vol. 6, Issue 9: 52-64     https://doi.org/10.11925/infotech.2096-3467.2021.1376
Research Paper
News Recommendation with Latent Topic Distribution and Long and Short-Term User Representations
Tang Jiao, Zhang Lisheng, Sang Chunyan
School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract

[Objective] This paper explores users' current concerns and stable preferences while making full use of news texts and additional information, addressing the limitations of existing news recommendation methods in exploiting news content and in modeling mixed long- and short-term user interests. [Methods] We established a news representation model that integrates titles, abstracts, bodies, as well as explicit and latent topics. On top of it, we built a user representation model that captures long- and short-term user interests, i.e., users' current concerns and stable preferences. [Results] We examined the proposed model on two large-scale news recommendation datasets. It reached 69.51% on AUC, 34.09% on MRR, 37.25% on nDCG@5, and 43.01% on nDCG@10 with the first dataset, and 66.05% on AUC, 30.93% on MRR, 34.30% on nDCG@5, and 40.46% on nDCG@10 with the second, consistently outperforming seven baseline models. [Limitations] Users with sparse click histories receive little attention; future work could address the user cold-start scenario. [Conclusions] The proposed model learns informative news and user representation vectors with advanced natural language processing techniques, and its design effectively improves the performance of news recommendation.

Key words: News Recommendation; Topic Model; Neural Network; Attention Mechanism
Received: 2021-12-05      Published: 2022-10-26
CLC Number: TP393; G250
Funding: National Natural Science Foundation of China (62002037); Natural Science Foundation of Chongqing (cstc2019jcyj-msxmX0588)
Corresponding author: Sang Chunyan, ORCID: 0000-0001-8338-7770, E-mail: sangcy@cqupt.edu.cn
Cite this article:
Tang Jiao, Zhang Lisheng, Sang Chunyan. News Recommendation with Latent Topic Distribution and Long and Short-Term User Representations. Data Analysis and Knowledge Discovery, 2022, 6(9): 52-64.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.1376      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2022/V6/I9/52
Symbol | Description
u | target user
d | candidate news
r_u | representation vector of u
r_d | representation vector of d
w_i | the i-th word in a word sequence
e_i | word embedding of w_i
c_i | contextual representation of w_i
α_i | attention weight of w_i
w_c, w_sc | words describing the category and the subcategory
e_c, e_sc | word embeddings of w_c and w_sc
θ_d | latent topic distribution vector of news d
z_{d,i} | probability that news d belongs to latent topic i
α_T, α_A, α_C, α_SC, α_Z | attention weights of the title, abstract, category, subcategory, and latent topics
r_T, r_A, r_C, r_SC, r_Z | representation vectors of the title, abstract, category, subcategory, and latent topics
m | length of user u's click history
d_i | the i-th news article in the click history
r_i | representation vector of d_i
h_i | interest feature vector of the user at step i
r_S | short-term interest representation of user u
r_L | long-term interest representation of user u
Table 1  Definitions and descriptions of symbols
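The symbols in Table 1 outline the shape of the model: view-level vectors (r_T, r_A, r_C, r_SC, r_Z) are fused into a news representation r_d by attention, the clicked-news sequence is encoded into a short-term interest r_S, and together with a long-term interest r_L it forms the user representation r_u used to score a candidate news d. The PyTorch snippet below is a minimal sketch of one plausible way these pieces could fit together; the additive attention, the GRU over clicked news, the per-user embedding for long-term interest, and the dot-product click score are assumptions for illustration, not the authors' released implementation. Word-level encoders producing r_T and r_A (e.g., CNN or self-attention with word-level attention pooling) are omitted.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Scores a set of vectors with a learned query and returns their weighted sum."""
    def __init__(self, dim, query_dim=200):
        super().__init__()
        self.proj = nn.Linear(dim, query_dim)
        self.query = nn.Parameter(torch.randn(query_dim))

    def forward(self, x):                                   # x: (batch, n, dim)
        scores = torch.tanh(self.proj(x)) @ self.query      # (batch, n)
        alpha = torch.softmax(scores, dim=-1)                # attention weights α_i
        return (alpha.unsqueeze(-1) * x).sum(dim=1)          # (batch, dim)

class NewsEncoder(nn.Module):
    """Fuses view-level vectors (title, abstract, category, subcategory,
    latent topics) into a single news representation r_d."""
    def __init__(self, dim=400, topic_dim=100):
        super().__init__()
        self.topic_proj = nn.Linear(topic_dim, dim)   # map θ_d into the shared space
        self.view_attn = AdditiveAttention(dim)

    def forward(self, r_T, r_A, r_C, r_SC, theta_d):
        r_Z = torch.tanh(self.topic_proj(theta_d))
        views = torch.stack([r_T, r_A, r_C, r_SC, r_Z], dim=1)   # (batch, 5, dim)
        return self.view_attn(views)                              # r_d

class UserEncoder(nn.Module):
    """Short-term interest from a GRU over clicked news, long-term interest
    from a per-user embedding; both are fused into r_u."""
    def __init__(self, num_users, dim=400):
        super().__init__()
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.click_attn = AdditiveAttention(dim)
        self.long_term = nn.Embedding(num_users, dim)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, clicked, user_ids):          # clicked: (batch, m, dim)
        h, _ = self.gru(clicked)                   # h_i for each clicked news
        r_S = self.click_attn(h)                   # short-term interest
        r_L = self.long_term(user_ids)             # long-term interest
        return self.fuse(torch.cat([r_S, r_L], dim=-1))   # r_u

def click_score(r_u, r_d):
    """Click score of candidate news d for user u: inner product of the two vectors."""
    return (r_u * r_d).sum(dim=-1)
```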
Fig. 1  Framework of the NLTLS model
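The framework relies on a latent topic distribution θ_d for each news article (Table 1), presumably obtained with a topic model such as LDA [32]. The snippet below is a hedged, illustrative sketch of deriving such a distribution with gensim; the toy corpus, preprocessing, and number of topics are assumptions, not the authors' settings.

```python
from gensim import corpora
from gensim.models import LdaModel

# Toy corpus standing in for tokenized news bodies; real preprocessing
# (stop-word removal, lemmatization, etc.) is omitted here.
docs = [
    ["stock", "market", "rally", "earnings"],
    ["team", "season", "coach", "playoffs"],
    ["election", "vote", "senate", "campaign"],
]

dictionary = corpora.Dictionary(docs)
bow_corpus = [dictionary.doc2bow(doc) for doc in docs]

num_topics = 2   # illustrative; the paper would tune this value
lda = LdaModel(bow_corpus, num_topics=num_topics, id2word=dictionary,
               random_state=0, passes=10)

# θ_d: latent topic distribution of one news article, as a dense vector
bow = dictionary.doc2bow(["market", "earnings", "stock"])
theta_d = [0.0] * num_topics
for topic_id, prob in lda.get_document_topics(bow, minimum_probability=0.0):
    theta_d[topic_id] = float(prob)   # z_{d,i} from Table 1
print(theta_d)
```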
Statistic | MIND | MINDsmall
Number of users | 1 000 000 | 94 057
Number of news articles | 161 013 | 65 238
Number of click sessions | 24 155 470 | 230 117
News information | title, abstract, category, subcategory, body | title, abstract, category, subcategory, body
Table 2  Statistics of the MIND and MINDsmall datasets
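For orientation, the MIND data [25] is distributed as tab-separated news.tsv (news metadata) and behaviors.tsv (impression logs); the counts in Table 2 correspond to distinct news IDs, distinct user IDs, and impression rows. The sketch below shows how such statistics could be recomputed; the column positions are the commonly documented layout and should be verified against the downloaded files.

```python
import csv

def mind_statistics(news_path="news.tsv", behaviors_path="behaviors.tsv"):
    """Counts news articles, users, and click sessions (impressions) in one MIND split."""
    with open(news_path, encoding="utf-8") as f:
        # Typical columns: news_id, category, subcategory, title, abstract, url, ...
        news_ids = {row[0] for row in csv.reader(f, delimiter="\t")}

    users, sessions = set(), 0
    with open(behaviors_path, encoding="utf-8") as f:
        # Typical columns: impression_id, user_id, time, click history, impressions
        for row in csv.reader(f, delimiter="\t"):
            users.add(row[1])
            sessions += 1

    return {"news": len(news_ids), "users": len(users), "sessions": sessions}

if __name__ == "__main__":
    print(mind_statistics())
```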
Model | MIND (AUC / MRR / nDCG@5 / nDCG@10) | MINDsmall (AUC / MRR / nDCG@5 / nDCG@10)
LightGCN | 0.629 1 / 0.294 7 / 0.315 9 / 0.373 5 | 0.499 7 / 0.219 3 / 0.223 8 / 0.286 8
DKN | 0.634 2 / 0.296 7 / 0.318 3 / 0.376 0 | 0.609 3 / 0.276 2 / 0.301 9 / 0.366 4
NPA | 0.635 4 / 0.298 6 / 0.320 5 / 0.378 4 | 0.583 9 / 0.2606 9 / 0.279 1 / 0.339 9
NRMS | 0.646 5 / 0.307 8 / 0.332 1 / 0.389 7 | 0.616 2 / 0.273 9 / 0.298 7 / 0.364 8
LSTUR | 0.656 7 / 0.312 0 / 0.337 2 / 0.394 6 | 0.615 8 / 0.281 1 / 0.303 7 / 0.366 5
GERL | 0.666 1 / 0.320 9 / 0.348 9 / 0.406 2 | 0.600 7 / 0.272 3 / 0.291 4 / 0.355 9
NAML | 0.687 0 / 0.336 2 / 0.366 7 / 0.423 6 | 0.643 5 / 0.295 5 / 0.321 9 / 0.386 7
NLTLS | 0.695 1 / 0.340 9 / 0.372 5 / 0.430 1 | 0.660 5 / 0.309 3 / 0.343 0 / 0.404 6
Table 3  Comparison of experimental results of different models
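Tables 3 and 4 report AUC, MRR, nDCG@5, and nDCG@10, the standard MIND evaluation metrics computed per impression and then averaged over impressions. A minimal sketch of per-impression versions of these metrics (scikit-learn used only for AUC) might look like the following; the toy labels and scores are purely illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mrr(y_true, y_score):
    """Mean reciprocal rank of the clicked items within one impression."""
    order = np.argsort(y_score)[::-1]
    ranks = np.flatnonzero(np.asarray(y_true)[order]) + 1
    return float(np.mean(1.0 / ranks))

def ndcg_at_k(y_true, y_score, k):
    """Normalized discounted cumulative gain at cutoff k for one impression."""
    y_true = np.asarray(y_true)
    order = np.argsort(y_score)[::-1]
    gains = y_true[order][:k]
    dcg = float(np.sum(gains / np.log2(np.arange(2, gains.size + 2))))
    ideal = np.sort(y_true)[::-1][:k]
    idcg = float(np.sum(ideal / np.log2(np.arange(2, ideal.size + 2))))
    return dcg / idcg if idcg > 0 else 0.0

# One impression: 1 = clicked candidate, 0 = skipped candidate
y_true = [0, 1, 0, 0, 1]
y_score = [0.1, 0.9, 0.2, 0.4, 0.3]
print(roc_auc_score(y_true, y_score), mrr(y_true, y_score), ndcg_at_k(y_true, y_score, 5))
```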
NLTLS variant | MIND (AUC / MRR / nDCG@5 / nDCG@10) | MINDsmall (AUC / MRR / nDCG@5 / nDCG@10)
Title only | 0.635 4 / 0.298 6 / 0.320 5 / 0.378 4 | 0.629 3 / 0.284 3 / 0.310 2 / 0.375 6
Category only | 0.656 0 / 0.314 6 / 0.340 3 / 0.397 4 | 0.621 5 / 0.289 7 / 0.317 5 / 0.378 9
Abstract only | 0.658 4 / 0.315 2 / 0.340 7 / 0.398 3 | 0.633 7 / 0.293 6 / 0.321 6 / 0.384 5
Latent topics only | 0.658 8 / 0.314 0 / 0.338 5 / 0.396 8 | 0.634 7 / 0.303 2 / 0.330 9 / 0.393 7
Sum-and-average only | 0.673 6 / 0.326 0 / 0.353 6 / 0.411 7 | 0.641 5 / 0.290 0 / 0.323 0 / 0.385 1
Attention mechanism only | 0.681 8 / 0.331 6 / 0.361 4 / 0.418 8 | 0.645 6 / 0.301 2 / 0.332 4 / 0.395 5
Short-term interest only | 0.692 6 / 0.339 5 / 0.370 5 / 0.427 7 | 0.640 9 / 0.299 0 / 0.329 5 / 0.392 2
Full NLTLS | 0.695 1 / 0.340 9 / 0.372 5 / 0.430 1 | 0.660 5 / 0.309 3 / 0.343 0 / 0.404 6
Table 4  Comparison of experimental results of NLTLS variants
Fig. 2  Effect of the number and window size of CNN convolution kernels on the model
Fig. 3  Effect of the length of the click-history sequence on the model
Fig. 4  Effect of the category embedding dimension on the model
Fig. 5  Effect of the word sequence length n on the model
[1] Xiang Liang. Recommended System Practices[M]. Beijing: The People’s Posts and Telecommunications Press, 2012: 18-20. (in Chinese)
[2] Resnick P, Varian H R. Recommender Systems[J]. Communications of the ACM, 1997, 40(3): 56-58.
[3] Adomavicius G, Tuzhilin A. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(6): 734-749.
doi: 10.1109/TKDE.2005.99
[4] Konstan J A, Miller B N, Maltz D, et al. GroupLens: Applying Collaborative Filtering to Usenet News[J]. Communications of the ACM, 1997, 40(3): 77-87.
doi: 10.1145/245108.245126
[5] Marlin B, Zemel R S. The Multiple Multiplicative Factor Model for Collaborative Filtering[C]// Proceedings of the 21st International Conference on Machine Learning. 2004: 576-583.
[6] Das A S, Datar M, Garg A, et al. Google News Personalization: Scalable Online Collaborative Filtering[C]// Proceedings of the 16th International Conference on World Wide Web. 2007: 271-280.
[7] He X N, Liao L Z, Zhang H W, et al. Neural Collaborative Filtering[C]// Proceedings of the 26th International Conference on World Wide Web. 2017: 173-182.
[8] Wang X, He X N, Wang M, et al. Neural Graph Collaborative Filtering[C]// Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019: 165-174.
[9] He X N, Deng K, Wang X, et al. LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation[C]// Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020: 639-648.
[10] Wang H W, Zhang F Z, Xie X, et al. DKN: Deep Knowledge-Aware Network for News Recommendation[C]// Proceedings of the 2018 World Wide Web Conference. 2018: 1835-1844.
[11] Wu C H, Wu F Z, An M X, et al. Neural News Recommendation with Attentive Multi-View Learning[C]// Proceedings of the 28th International Joint Conference on Artificial Intelligence. 2019: 3863-3869.
[12] Wu C H, Wu F Z, An M X, et al. NPA: Neural News Recommendation with Personalized Attention[C]// Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019: 2576-2584.
[13] An M X, Wu F Z, Wu C H, et al. Neural News Recommendation with Long- and Short-Term User Representations[C]// Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019: 336-345.
[14] Wu C H, Wu F Z, Ge S Y, et al. Neural News Recommendation with Multi-Head Self-Attention[C]// Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. 2019: 6389-6394.
[15] Zhang L M, Liu P, Gulla J A. Dynamic Attention-Integrated Neural Network for Session-Based News Recommendation[J]. Machine Learning, 2019, 108(10): 1851-1875.
doi: 10.1007/s10994-018-05777-9
[16] Hu L M, Li C, Shi C, et al. Graph Neural News Recommendation with Long-Term and Short-Term Interest Modeling[J]. Information Processing & Management, 2020, 57(2): 102142.
doi: 10.1016/j.ipm.2019.102142
[17] Wu C H, Wu F Z, Huang Y F, et al. Neural News Recommendation with Negative Feedback[J]. CCF Transactions on Pervasive Computing and Interaction, 2020, 2(3): 178-188.
doi: 10.1007/s42486-020-00044-0
[18] Wu C H, Wu F Z, Qi T, et al. User Modeling with Click Preference and Reading Satisfaction for News Recommendation[C]// Proceedings of the 29th International Joint Conference on Artificial Intelligence. 2020: 3023-3029.
[19] Ge S Y, Wu C H, Wu F Z, et al. Graph Enhanced Representation Learning for News Recommendation[C]// Proceedings of the Web Conference 2020. 2020: 2863-2869.
[20] Liu D Y, Lian J X, Wang S Y, et al. KRED: Knowledge-Aware Document Representation for News Recommendations[C]// Proceedings of the 14th ACM Conference on Recommender Systems. 2020: 200-209.
[21] Hu L M, Xu S Y, Li C, et al. Graph Neural News Recommendation with Unsupervised Preference Disentanglement[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 4255-4264.
[22] Raza S, Ding C. Deep Dynamic Neural Network to Trade-off Between Accuracy and Diversity in a News Recommender System[C]// Proceedings of 2021 IEEE International Conference on Big Data. 2021: 5246-5256.
[23] Wu L, He X N, Wang X, et al. A Survey on Neural Recommendation: From Collaborative Filtering to Content and Context Enriched Recommendation[OL]. arXiv Preprint, arXiv: 2104.13030.
[24] Cao S X, Yang N, Liu Z Z. Online News Recommender Based on Stacked Auto-Encoder[C]// Proceedings of 2017 IEEE/ACIS 16th International Conference on Computer and Information Science. 2017: 721-726.
[25] Wu F Z, Qiao Y, Chen J H, et al. MIND: A Large-Scale Dataset for News Recommendation[C]// Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020: 3597-3606.
[26] Tian Xuan, Ding Qi, Liao Zihui, et al. Survey on Deep Learning Based News Recommendation Algorithm[J]. Journal of Frontiers of Computer Science and Technology, 2021, 15(6): 971-998. (in Chinese)
doi: 10.3778/j.issn.1673-9418.2007021
[27] Okura S, Tagami Y, Ono S, et al. Embedding-Based News Recommendation for Millions of Users[C]// Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017: 1933-1942.
[28] Park K, Lee J, Choi J. Deep Neural Networks for News Recommendations[C]// Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 2017: 2255-2258.
[29] Sheu H S, Li S. Context-Aware Graph Embedding for Session-Based News Recommendation[C]// Proceedings of the 14th ACM Conference on Recommender Systems. 2020: 657-662.
[30] Wu Fangzhao, Wu Chuhan, An Mingxiao, et al. Personalized News Recommendation Based on Deep Learning[J]. Journal of Nanjing University of Information Science & Technology (Natural Science Edition), 2019, 11(3): 278-285. (in Chinese)
[31] Huang P S, He X D, Gao J F, et al. Learning Deep Structured Semantic Models for Web Search Using Clickthrough Data[C]// Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. 2013: 2333-2338.
[32] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet Allocation[J]. The Journal of Machine Learning Research, 2003, 3: 993-1022.
[33] Burke R. Hybrid Recommender Systems: Survey and Experiments[J]. User Modeling and User-Adapted Interaction, 2002, 12(4): 331-370.
doi: 10.1023/A:1021240730564
[34] Bansal T, Das M, Bhattacharyya C. Content Driven User Profiling for Comment-Worthy Recommendations of News and Blog Articles[C]// Proceedings of the 9th ACM Conference on Recommender Systems. 2015: 195-202.
[35] Li L, Zheng L, Yang F, et al. Modeling and Broadening Temporal User Interest in Personalized News Recommendation[J]. Expert Systems with Applications, 2014, 41(7): 3168-3177.
doi: 10.1016/j.eswa.2013.11.020
[36] Phelan O, McCarthy K, Smyth B. Using Twitter to Recommend Real-Time Topical News[C]// Proceedings of the 3rd ACM Conference on Recommender Systems. 2009: 385-388.
[37] Son J W, Kim A Y, Park S B. A Location-Based News Article Recommendation with Explicit Localized Semantic Analysis[C]// Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2013: 293-302.
[38] Kim Y. Convolutional Neural Networks for Sentence Classification[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1746-1751.
[39] Cho K, van Merrienboer B, Gulcehre C, et al. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1724-1734.
[40] Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate[OL]. arXiv Preprint, arXiv: 1409.0473.
[41] Vaswani A, Shazeer N, Parmar N, et al. Attention is All You Need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 6000-6010.
[42] Deng L Y. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning[J]. Technometrics, 2006, 48(1): 147-148.
doi: 10.1198/tech.2006.s353
[43] Pennington J, Socher R, Manning C D. GloVe: Global Vectors for Word Representation[C]// Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. 2014: 1532-1543.
[44] Srivastava N, Hinton G, Krizhevsky A, et al. Dropout: A Simple Way to Prevent Neural Networks from Overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
[45] Kingma D P, Ba J. Adam: A Method for Stochastic Optimization[OL]. arXiv Preprint, arXiv: 1412.6980.