New Technology of Library and Information Service  2015, Vol. 31 Issue (9): 9-16     https://doi.org/10.11925/infotech.1003-3513.2015.09.02
Research Paper
User Interest Prediction Combing Topic Model and Multi-time Function
Gui Sisi1, Lu Wei1,2, Huang Shihao1, Zhou Pengcheng1
1 School of Information Management, Wuhan University, Wuhan 430072, China;
2 Center for the Studies of Information Resources, Wuhan University, Wuhan 430072, China
Abstract

[Objective] User interests change over time. This paper proposes a user interest prediction method that combines a topic model with a multi-time-point function. [Methods] A topic model generates the user's interests. For each interest, a multi-time-point function weights every occurrence of that interest in the user's history, and the weighted occurrences are used to predict the distribution of the user's interests at the next time point. [Results] Experiments on the search engine log data provided by Sogou Lab compare the method with a memory-based user interest model and a forgetting-curve-based multi-stage user interest quantification model. Both cosine similarity and Kullback-Leibler (KL) divergence indicate that the proposed method predicts user interests more accurately. [Limitations] The method is tested only on the Sogou search log and needs further examination on other data sets. [Conclusions] Taking every time point in a user's history into account yields more accurate user interest prediction.
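
The weighting-and-evaluation pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not give the form of the multi-time-point function, so an exponential decay over the distance to the next time point is used here as a stand-in, and the names `predict_interest`, `cosine_similarity`, `kl_divergence`, and the `decay` parameter are hypothetical. The per-time-point topic distributions are assumed to come from a topic model that has already been run.

```python
import math

def predict_interest(history, next_time, decay=0.5):
    """Predict the topic distribution at next_time.

    history: list of (time_index, topic_distribution) pairs.
    Every past occurrence contributes, weighted by an exponential
    decay in its distance from next_time. The decay form is an
    assumption, not the function used in the paper.
    """
    n_topics = len(history[0][1])
    scores = [0.0] * n_topics
    for t, dist in history:
        weight = math.exp(-decay * (next_time - t))
        for k, p in enumerate(dist):
            scores[k] += weight * p
    total = sum(scores)
    # Normalize so the prediction is a probability distribution.
    return [s / total for s in scores]

def cosine_similarity(p, q):
    """Cosine of the angle between two topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0) and division by zero."""
    return sum(a * math.log((a + eps) / (b + eps))
               for a, b in zip(p, q) if a > 0)

# Toy data: three time points, three topics (made-up numbers).
history = [
    (1, [0.7, 0.2, 0.1]),
    (2, [0.5, 0.3, 0.2]),
    (3, [0.2, 0.3, 0.5]),
]
predicted = predict_interest(history, next_time=4)
observed = [0.1, 0.3, 0.6]  # hypothetical true distribution at time 4

print("predicted:", predicted)
print("cosine:", cosine_similarity(predicted, observed))
print("KL(observed || predicted):", kl_divergence(observed, predicted))
```

A higher cosine similarity and a lower KL divergence between the predicted and observed distributions both indicate a better prediction, which is how the two measures are read in the results above.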

Received: 2015-04-03      Published: 2016-04-06
CLC Number:  TP393
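Funding:

This work is one of the research outcomes of the Major Project of the Humanities and Social Sciences Research Bases of the Ministry of Education, "Research on Fine-Grained Web Information Retrieval Models and Framework Construction" (Grant No. 10JJD630014), and the General Program of the National Natural Science Foundation of China, "Semantic Recognition of Academic Text and Knowledge Graph Construction Oriented to Lexical Function" (Grant No. 71473183).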

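Corresponding author: Lu Wei, ORCID: 0000-0002-0929-7416, E-mail: weilu@whu.edu.cn
Author contributions: Gui Sisi: proposed the research question, designed the implementation plan, analyzed and processed the data, and drafted and revised the paper; Lu Wei: designed the research plan and revised the final version of the paper; Huang Shihao: preprocessed the Sogou data set and generated user interests with the topic model; Zhou Pengcheng: implemented the memory-based user interest model and the forgetting-curve-based multi-stage user interest quantification model on the Sogou data set.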
Cite this article:
Gui Sisi, Lu Wei, Huang Shihao, Zhou Pengcheng. User Interest Prediction Combing Topic Model and Multi-time Function. New Technology of Library and Information Service, 2015, 31(9): 9-16.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2015.09.02      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2015/V31/I9/9

