New Technology of Library and Information Service  2016, Vol. 32 Issue (6): 73-79     https://doi.org/10.11925/infotech.1003-3513.2016.06.09
Research Paper
A Collaborative Filtering Recommendation Algorithm Based on Item Probability Distribution*
Wang Yong1, Deng Jiangzhou1, Deng Yongheng1, Zhang Pu2
1 Key Laboratory of Electronic Commerce and Logistics, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 College of Computer Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract

[Objective] This study aims to remove the dependence of traditional item similarity measures on co-rated items and to improve prediction accuracy on sparse datasets. [Methods] We introduce the Kullback-Leibler (KL) divergence, borrowed from signal processing, into the computation of item similarity. Similarity is computed from the probability distribution of each item's rating values, which identifies the neighbor items of a target item more effectively. [Results] Experiments on the MovieLens dataset show that the proposed algorithm achieves an F1 score above 0.65 and clearly outperforms commonly used item similarity measures in prediction validity, prediction error, and recommendation accuracy. [Limitations] The algorithm considers only the proportion of each rating value for an item and does not make full use of the absolute rating values. [Conclusions] The algorithm makes effective use of the rating information in the dataset, copes well with data sparsity, and has good practical value.

Key words: Item similarity; Collaborative filtering; Kullback-Leibler divergence; Recommendation algorithm
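
The idea in the abstract, comparing items through the distribution of their rating values rather than through co-rated users, can be illustrated with a minimal Python sketch. This is not the paper's formula, which the abstract does not give. The 1 to 5 rating scale, the additive smoothing constant `alpha`, the symmetrised divergence, and the `1 / (1 + d)` mapping to a similarity score are all assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter

RATING_VALUES = (1, 2, 3, 4, 5)  # assumed rating scale (star ratings)


def rating_distribution(ratings, values=RATING_VALUES, alpha=1.0):
    """Smoothed probability of each rating value for one item.

    alpha is an assumed additive-smoothing constant; it keeps every
    probability positive so the KL divergence stays finite.
    """
    counts = Counter(ratings)
    total = len(ratings) + alpha * len(values)
    return [(counts.get(v, 0) + alpha) / total for v in values]


def kl_divergence(p, q):
    """D(p || q) = sum_k p_k * log(p_k / q_k), using the natural logarithm."""
    return sum(pk * math.log(pk / qk) for pk, qk in zip(p, q) if pk > 0)


def item_similarity(ratings_i, ratings_j):
    """Map a symmetrised KL divergence to a similarity in (0, 1].

    Both the symmetrisation and the 1 / (1 + d) mapping are illustrative
    choices, not formulas taken from the paper.
    """
    p = rating_distribution(ratings_i)
    q = rating_distribution(ratings_j)
    d = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + d)


# Ratings for three hypothetical items; no user needs to have rated two of them.
item_a = [5, 4, 5, 4, 5]
item_b = [4, 5, 5, 4]
item_c = [1, 2, 1, 1, 2, 1]

sim_ab = item_similarity(item_a, item_b)
sim_ac = item_similarity(item_a, item_c)
print(f"sim(a, b) = {sim_ab:.3f}, sim(a, c) = {sim_ac:.3f}")
```

Because each item is summarised by its own rating histogram, the similarity is defined even when two items share no raters, which is the property the abstract relies on for sparse data. The sketch also shows the stated limitation: only the proportions of rating values enter the computation.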
Received: 2016-01-26      Published: 2016-07-18
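Funding: *This work was supported by the National Natural Science Foundation of China project "Research on Probabilistic Topic Models Combined with Knowledge Graphs" (Grant No. 61502066) and the Chongqing Basic and Frontier Research Project "Research on Fine-Grained Opinion Mining Methods for Product Reviews" (Grant No. cstc2015jcyjA40025).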
Cite this article:
Wang Yong, Deng Jiangzhou, Deng Yongheng, Zhang Pu. A Collaborative Filtering Recommendation Algorithm Based on Item Probability Distribution[J]. New Technology of Library and Information Service, 2016, 32(6): 73-79.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2016.06.09      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2016/V32/I6/73