Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (6): 15-21    DOI: 10.11925/infotech.2096-3467.2019.1332
Recommending Domain Knowledge Based on Parallel Collaborative Filtering Algorithm
Yang Heng, Wang Sili, Zhu Zhongming, Liu Wei, Wang Nan
Literature and Information Center of Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou 730000, China

[Objective] This paper aims to identify the information users need and to make timely, accurate recommendations. [Methods] First, we generated a candidate set using a content-based recommendation algorithm and an item-based collaborative filtering algorithm. Then, we parallelized the computation with MapReduce to improve the data mining performance of the proposed method. Finally, we applied machine learning algorithms to improve the accuracy of the recommended candidates and recommended personalized documents to users. [Results] We created recommendation lists based on the articles each user had viewed. The model’s evaluation accuracy was 78.5%, and its mean squared error was 0.22. [Limitations] The user and text features need further investigation, and the accuracy of word segmentation and the model training algorithm need to be optimized. [Conclusions] The proposed model generates personalized recommendation lists for users and provides good support for related services.

Keywords: Recommendation System; Collaborative Filtering; MapReduce; Machine Learning Algorithm
Received: 31 December 2019      Published: 07 July 2020
CLC Number: TP391
Corresponding Author: Yang Heng

Cite this article:

Yang Heng, Wang Sili, Zhu Zhongming, Liu Wei, Wang Nan. Recommending Domain Knowledge Based on Parallel Collaborative Filtering Algorithm. Data Analysis and Knowledge Discovery, 2020, 4(6): 15-21.


Architecture of a Knowledge Recommendation System in the Marine Domain
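Data type | Data attributes
User feature data | User ID, user name, user's topic
Text feature data | Text ID, text title, text's topic, score
User behavior feature data | User ID, visited text ID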
Classification of Data Categories
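The three data categories in the table can be pictured as simple record types. The following is a minimal sketch, not the authors' schema: the class names, field names, and the helper `to_rating_triples` are all hypothetical, and the choice to reuse the text's score as the per-visit score is an assumption made only to produce `{userid,itemid,score}` triples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserFeature:
    user_id: str
    user_name: str
    user_topic: str  # topic the user belongs to


@dataclass(frozen=True)
class TextFeature:
    text_id: str
    title: str
    topic: str  # topic the text belongs to
    score: float


@dataclass(frozen=True)
class UserBehavior:
    user_id: str
    visited_text_id: str


def to_rating_triples(behaviors, texts):
    """Join visit records with text scores into (userid, itemid, score) triples.

    Assumption: a visit inherits the score stored on the text record.
    Visits to unknown texts are skipped.
    """
    score_by_text = {t.text_id: t.score for t in texts}
    return [
        (b.user_id, b.visited_text_id, score_by_text[b.visited_text_id])
        for b in behaviors
        if b.visited_text_id in score_by_text
    ]
```

Frozen dataclasses keep the records immutable and hashable, which is convenient when they are used as keys or deduplicated during preprocessing.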
Implementation of MapReduce Inversion Collaborative Filtering
Step | First MR stage | Second MR stage | Third MR stage
Map input | {userid,itemid,score} | {userid,itemid,score} | {itemid,itemid,score}
Map output | {itemid,userid,score} | {userid,itemid,score} | {itemid,itemid,score}
Reduce input | {itemid,userid,score} | {userid,itemid,score} | {itemid,itemid,score}
Reduce output | {userid,itemid,score} | {itemid,itemid,score} | {itemid,itemid,score}
Calculation Process of MapReduce Inversion Collaborative Filtering
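The record shapes in the table can be traced with a small in-memory simulation. This is a sketch under stated assumptions, not the authors' implementation: the table gives only the input and output shapes, so what each reducer computes is assumed here (stage 1 normalizes each item's scores by the item's L2 norm, stage 2 emits partial products for item pairs rated by the same user, stage 3 sums them, which yields cosine similarity). The helper `run_mapreduce` and all function names are hypothetical, and each user is assumed to rate an item at most once.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce round in memory: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    output = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output


# Stage 1: invert {userid,itemid,score} to be keyed by item, then normalize.
def map_invert(record):
    userid, itemid, score = record
    yield itemid, (userid, score)


def reduce_normalize(itemid, values):
    norm = sqrt(sum(score * score for _, score in values))
    if norm == 0:
        return  # an item with only zero scores contributes nothing
    for userid, score in values:
        yield userid, itemid, score / norm


# Stage 2: group by user and emit one partial product per item pair.
def map_by_user(record):
    userid, itemid, score = record
    yield userid, (itemid, score)


def reduce_pairs(userid, values):
    for (item_a, score_a), (item_b, score_b) in combinations(sorted(values), 2):
        yield item_a, item_b, score_a * score_b


# Stage 3: sum the partial products for each item pair.
def map_pair(record):
    item_a, item_b, partial = record
    yield (item_a, item_b), partial


def reduce_sum(pair, partials):
    yield pair[0], pair[1], sum(partials)


def item_similarities(ratings):
    stage1 = run_mapreduce(ratings, map_invert, reduce_normalize)
    stage2 = run_mapreduce(stage1, map_by_user, reduce_pairs)
    return run_mapreduce(stage2, map_pair, reduce_sum)


# Toy ratings for illustration only; not data from the paper.
ratings = [
    ("u1", "d1", 4.0), ("u1", "d2", 5.0),
    ("u2", "d1", 3.0), ("u2", "d2", 4.0), ("u2", "d3", 2.0),
    ("u3", "d2", 1.0), ("u3", "d3", 5.0),
]
similarities = item_similarities(ratings)
```

Each stage reads only the previous stage's output, so on a real cluster the three rounds could run as chained jobs; the in-memory grouping here stands in for the shuffle phase.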
Data Samples for Model Training
Model Evaluation Results
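The abstract reports two evaluation figures, accuracy and mean squared error. As a reminder of how such figures are typically computed, here is a minimal sketch. It assumes binary labels (1 if the user viewed the document, 0 otherwise) and predicted probabilities, which the page does not confirm; the function names, the 0.5 threshold, and the toy values are illustrative and are not the paper's data.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Share of examples whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(1 for t, p in zip(y_true, predictions) if t == p)
    return correct / len(y_true)


def mean_squared_error(y_true, y_prob):
    """Average squared gap between label and predicted probability."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Toy values for illustration only.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.4, 0.7]
```

Accuracy depends on the chosen threshold, while mean squared error does not, so the two figures capture different aspects of model quality.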