Data Analysis and Knowledge Discovery  2022, Vol. 6 Issue (5): 64-76    DOI: 10.11925/infotech.2096-3467.2021.0842
Mining Uninteresting Items with Visibility of User Time Points and Collaborative Filtering Recommendation Method
Shi Lei, Li Shuqing
College of Information Engineering, Nanjing University of Finance & Economics, Nanjing 210023, China
Abstract  

[Objective] This paper proposes a method to improve collaborative filtering algorithms based on explicit feedback, aiming to address data sparsity and user selection bias. [Methods] First, we extracted the negative preferences of users who had seen items but did not interact with them. Then, we measured item visibility from user activity, item popularity, and time factors. Third, we introduced the concept of pre-use preference and constructed a weighted matrix factorization model based on visibility at user time points. Finally, we identified the items that users were not interested in and filled them with low rating values. [Results] We evaluated the model on the MovieLens datasets and found that the recommendation accuracy of ItemCF and BiasSVD increased by an average of 2 to 2.5 times. [Limitations] Modeling pre-use preferences from users’ negative preferences for “seen-but-not-interacted” items may introduce empirical bias. [Conclusions] The proposed model effectively reduces the impact of data sparsity and user selection bias and yields more accurate recommendations.
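
The final step of the method (filling items a user is unlikely to be interested in with a low value) can be illustrated with a minimal sketch. It assumes that pre-use preference scores in [0, 1] are already available from some upstream model; the function name `fill_uninteresting`, the threshold `theta`, and the fill value `low_value` are hypothetical names chosen for illustration and are not taken from the paper.

```python
import numpy as np

def fill_uninteresting(ratings, pre_use_pref, theta=0.2, low_value=1.0):
    """Fill unrated cells whose pre-use preference is below `theta`.

    ratings      : 2-D array of explicit ratings, np.nan marks a missing rating.
    pre_use_pref : 2-D array in [0, 1] with the same shape (assumed to come
                   from an upstream model such as a weighted matrix factorization).
    Returns the filled copy and the boolean mask of cells that were filled.
    """
    ratings = np.asarray(ratings, dtype=float)
    pre_use_pref = np.asarray(pre_use_pref, dtype=float)
    if ratings.shape != pre_use_pref.shape:
        raise ValueError("ratings and pre_use_pref must have the same shape")

    missing = np.isnan(ratings)                       # cells without a rating
    uninteresting = missing & (pre_use_pref < theta)  # low pre-use preference
    filled = ratings.copy()
    filled[uninteresting] = low_value                 # mark with a low rating
    return filled, uninteresting

# Toy data: 2 users x 3 items (illustrative values only).
ratings = np.array([[5.0, np.nan, np.nan],
                    [np.nan, 3.0, np.nan]])
pref = np.array([[0.9, 0.05, 0.6],
                 [0.1, 0.8, 0.3]])
filled, mask = fill_uninteresting(ratings, pref, theta=0.2, low_value=1.0)
```

Observed ratings are left untouched, and unrated cells with a pre-use preference at or above the threshold stay missing, so a downstream model such as ItemCF or BiasSVD would train on a denser matrix without overwriting real feedback.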

Key words: Collaborative Filtering; Explicit Feedback; Selection Bias; Pre-use Preference; Uninteresting Items
Received: 12 August 2021      Published: 21 June 2022
ZTFLH:  TP393  
Fund: Natural Science Major Foundation of the Jiangsu Higher Education Institutions of China (19KJA510011); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX20_1348)
Corresponding Author: Li Shuqing, ORCID: 0000-0001-9814-5766, E-mail: leeshuqing@163.com

Cite this article:

Shi Lei, Li Shuqing. Mining Uninteresting Items with Visibility of User Time Points and Collaborative Filtering Recommendation Method. Data Analysis and Knowledge Discovery, 2022, 6(5): 64-76.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2021.0842     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2022/V6/I5/64

Statistic            MovieLens 100k     MovieLens latest
Number of ratings    100,000            100,836
Number of users      943                610
Number of items      1,682              9,742
Sparsity             93.7%              98.3%
Rating time span     1997.09-1998.04    1996.03-2018.09
Description of Dataset
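
The sparsity values in the dataset description can be reproduced from the listed counts, assuming the usual definition of sparsity as the share of empty cells in the user-item matrix. The helper name `sparsity` is chosen here for illustration.

```python
def sparsity(n_ratings, n_users, n_items):
    """Fraction of user-item cells that have no rating."""
    return 1.0 - n_ratings / (n_users * n_items)

# Counts taken from the dataset description table.
ml_100k = sparsity(100_000, 943, 1_682)
ml_latest = sparsity(100_836, 610, 9_742)
print(f"{ml_100k:.1%}, {ml_latest:.1%}")  # prints "93.7%, 98.3%"
```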
Rating                         MovieLens 100k    MovieLens latest
Low ratings (1 and 2)          17.48%            13.41%
High ratings (3, 4 and 5)      82.52%            86.59%
Rating Distribution
Metric     ItemCF    BiasSVD    EMDP+ItemCF    EMDP+BiasSVD    RZF+ItemCF    RZF+BiasSVD
P@5        0.0612    0.0645     0.0533         0.0614          0.1425        0.1402
P@10       0.0574    0.0573     0.0523         0.0534          0.1143        0.1090
P@20       0.0486    0.0459     0.0453         0.0426          0.0863        0.0820
R@5        0.0462    0.0566     0.0485         0.0530          0.1590        0.1662
R@10       0.0938    0.0931     0.0945         0.0919          0.2436        0.2496
R@20       0.1724    0.1594     0.1727         0.1529          0.3543        0.3639
NDCG@5     0.0719    0.0774     0.0630         0.0760          0.1938        0.1893
NDCG@10    0.0838    0.0851     0.0779         0.0846          0.2126        0.2068
NDCG@20    0.1083    0.1048     0.1031         0.1019          0.2462        0.2420
Recommendation Accuracy in MovieLens 100k
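
The P@K, R@K and NDCG@K rows in the accuracy tables are top-K ranking metrics. The following is a minimal sketch of how such metrics are commonly computed for a single user, assuming binary relevance and a log2 position discount; the page does not state the exact variant used in the experiments, and the function names are illustrative.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Share of the relevant items that appear in the top-k list."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the list divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example: one user's ranked list and held-out relevant items.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"a", "d", "z"}
p5 = precision_at_k(ranked, relevant, 5)   # 2 hits out of 5
r5 = recall_at_k(ranked, relevant, 5)      # 2 hits out of 3 relevant items
n5 = ndcg_at_k(ranked, relevant, 5)
```

Reported values would normally be the average of these per-user scores over all test users.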
Metric     ItemCF    BiasSVD    EMDP+ItemCF    EMDP+BiasSVD    RZF+ItemCF    RZF+BiasSVD
P@5        0.0429    0.0462     0.0259         0.0285          0.0952        0.1059
P@10       0.0346    0.0401     0.0252         0.0250          0.0709        0.0838
P@20       0.0276    0.0326     0.0209         0.0222          0.0540        0.0632
R@5        0.0291    0.0335     0.0221         0.0212          0.0805        0.0932
R@10       0.0435    0.0532     0.0396         0.0342          0.1218        0.1459
R@20       0.0695    0.0887     0.0607         0.0593          0.1694        0.2081
NDCG@5     0.0514    0.0546     0.0355         0.0344          0.1177        0.1321
NDCG@10    0.0521    0.0569     0.0402         0.0361          0.1196        0.1399
NDCG@20    0.0585    0.0658     0.0471         0.0435          0.1328        0.1571
Recommendation Accuracy in MovieLens latest
Score Distribution of EMDP Filling
Influence of Parameter ε
Metric     WRMF      eALS      UTV-eALS
P@5        0.4017    0.4184    0.4390
P@10       0.3421    0.3448    0.3587
P@20       0.2720    0.2688    0.2871
R@5        0.1471    0.1494    0.1582
R@10       0.2352    0.2312    0.2404
R@20       0.3502    0.3403    0.3580
NDCG@5     0.4282    0.4468    0.4697
NDCG@10    0.4190    0.4163    0.4345
NDCG@20    0.4179    0.4083    0.4316
Experimental Results of Pre-use Preference Modeling in MovieLens 100k
Metric     WRMF      eALS      UTV-eALS
P@5        0.3159    0.3195    0.3373
P@10       0.2720    0.2704    0.2814
P@20       0.2206    0.2184    0.2264
R@5        0.0903    0.0932    0.0970
R@10       0.1521    0.1518    0.1544
R@20       0.2302    0.2329    0.2352
NDCG@5     0.3339    0.3392    0.3575
NDCG@10    0.3143    0.3129    0.3287
NDCG@20    0.3098    0.3093    0.3194
Experimental Results of Pre-use Preference Modeling in MovieLens latest
Pre-use preference    MovieLens 100k    MovieLens latest
[0,0.2)               92.64%            97.24%
[0.2,0.4)             5.40%             2.03%
[0.4,0.6)             1.48%             0.52%
[0.6,0.8)             0.38%             0.15%
[0.8,1]               0.10%             0.06%
Pre-use Preference Distribution
Influence of Parameter θ
Influence of Parameter σ
Filled rating    P@5 (ItemCF)    P@5 (BiasSVD)
4                0.0926          0.0868
5                0.0152          0.0236
Results of High Value Filling
Metric     ItemCF    BiasSVD    UIMLF+ItemCF    UIMLF+BiasSVD    ItemKNN    PureSVD    UTV-eALS
P@5        0.0612    0.0645     0.1883          0.2134           0.1423     0.1865     0.1786
P@10       0.0574    0.0573     0.1456          0.1616           0.1183     0.1440     0.1392
P@20       0.0486    0.0459     0.1084          0.1145           0.0891     0.1048     0.1031
R@5        0.0462    0.0566     0.2040          0.2485           0.1669     0.2155     0.2091
R@10       0.0938    0.0931     0.3164          0.3454           0.2686     0.3156     0.3104
R@20       0.1724    0.1594     0.4601          0.4802           0.3878     0.4407     0.4415
NDCG@5     0.0719    0.0774     0.2467          0.3037           0.1934     0.2584     0.2472
NDCG@10    0.0838    0.0851     0.2678          0.3195           0.2199     0.2767     0.2700
NDCG@20    0.1083    0.1048     0.3119          0.3581           0.2568     0.3146     0.3108
Experimental Results of Data Filling in MovieLens 100k
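
The gain of the filled variants over their baselines can be checked with simple arithmetic. The sketch below assumes that UIMLF denotes the proposed filling method (the page does not expand the abbreviation) and that "increased by N times" in the abstract is read as relative gain, (improved - baseline) / baseline. It uses only the P@5 values for MovieLens 100k from the data filling table; the abstract's figure is an average, so a single metric need not match it exactly.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline

# P@5 on MovieLens 100k: (baseline, UIMLF-filled variant), from the table.
p_at_5 = {
    "ItemCF": (0.0612, 0.1883),
    "BiasSVD": (0.0645, 0.2134),
}
gains = {name: relative_gain(base, filled)
         for name, (base, filled) in p_at_5.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f}x over baseline")
```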
Metric     ItemCF    BiasSVD    UIMLF+ItemCF    UIMLF+BiasSVD    ItemKNN    PureSVD    UTV-eALS
P@5        0.0429    0.0462     0.1522          0.1685           0.1167     0.1503     0.1448
P@10       0.0346    0.0401     0.1159          0.1251           0.0919     0.1174     0.1144
P@20       0.0276    0.0326     0.0916          0.0942           0.0711     0.0884     0.0876
R@5        0.0291    0.0335     0.1136          0.1317           0.1057     0.1221     0.1189
R@10       0.0435    0.0532     0.1714          0.1915           0.1538     0.1824     0.1797
R@20       0.0695    0.0887     0.2626          0.2821           0.2286     0.2554     0.2680
NDCG@5     0.0514    0.0546     0.1839          0.2045           0.1508     0.1895     0.1820
NDCG@10    0.0521    0.0569     0.1830          0.2028           0.1482     0.1913     0.1833
NDCG@20    0.0585    0.0658     0.2062          0.2248           0.1680     0.2076     0.2054
Experimental Results of Data Filling in MovieLens latest