After introducing the basic collaborative filtering algorithm, this paper surveys six kinds of techniques used to mitigate its scalability problem: clustering, probabilistic approaches, dimensionality reduction, item-based methods, dataset reduction, and linear models. It reviews in detail the collaborative filtering algorithms built on these techniques and summarizes their underlying ideas in two points: (1) reducing the neighborhood search space without degrading recommendation quality, and (2) periodically computing user similarities and performing the neighborhood search offline to reduce the online recommendation computation. Finally, it discusses two future research directions for the scalability problem in collaborative filtering: algorithms based on a distributed architecture, and neighborhood search based on formal concept analysis.
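The two ideas summarized in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the item-based variant (one of the six techniques listed), cosine similarity, and a small in-memory rating dictionary. The names `build_item_neighbors`, `predict`, and the parameter `k` are hypothetical. The offline function restricts each item to its top-k neighbors (a smaller search space), and the online function only reads that precomputed table.

```python
import math
from collections import defaultdict


def build_item_neighbors(ratings, k=2):
    """Offline step (assumed design): cosine similarity between items,
    keeping only the top-k most similar neighbors per item.

    ratings: dict mapping user -> {item: rating}
    returns: dict mapping item -> list of (neighbor_item, similarity)
    """
    # Invert the data to item -> {user: rating}.
    by_item = defaultdict(dict)
    for user, items in ratings.items():
        for item, r in items.items():
            by_item[item][user] = r

    norms = {i: math.sqrt(sum(r * r for r in users.values()))
             for i, users in by_item.items()}

    neighbors = {}
    for i, users_i in by_item.items():
        sims = []
        for j, users_j in by_item.items():
            if i == j or norms[i] == 0 or norms[j] == 0:
                continue
            common = users_i.keys() & users_j.keys()
            if not common:
                continue
            dot = sum(users_i[u] * users_j[u] for u in common)
            sims.append((dot / (norms[i] * norms[j]), j))
        sims.sort(reverse=True)
        neighbors[i] = [(j, s) for s, j in sims[:k]]
    return neighbors


def predict(ratings, neighbors, user, item):
    """Online step: similarity-weighted average over the precomputed
    neighbors of `item` that `user` has already rated."""
    rated = ratings.get(user, {})
    num = den = 0.0
    for j, s in neighbors.get(item, []):
        if j in rated:
            num += s * rated[j]
            den += abs(s)
    return num / den if den else None


# Toy data, invented for illustration only.
ratings = {
    "u1": {"a": 5, "b": 4},
    "u2": {"a": 4, "b": 5, "c": 1},
    "u3": {"b": 2, "c": 5},
    "u4": {"a": 5},
}
neighbors = build_item_neighbors(ratings, k=2)  # run periodically, offline
score = predict(ratings, neighbors, "u4", "b")  # cheap lookup at request time
```

The same offline/online split applies when user-user similarities are precomputed instead. The item-based form is used here only because item neighborhoods tend to change more slowly than user neighborhoods.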
Li Cong. Review of Scalability Problem in E-commerce Collaborative Filtering [J]. New Technology of Library and Information Service, 2010, 26(11): 37-41.