Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (8): 54-64     https://doi.org/10.11925/infotech.2096-3467.2021.0102
Research Paper
Recommending Scientific Literature Based on Author Preference and Heterogeneous Information Network*
Wang Qinjie, Qin Chunxiu, Ma Xubu, Liu Huailiang, Xu Cunzhen
School of Economics & Management, Xidian University, Xi'an 710126, China
Abstract

[Objective] This study combines heterogeneous information network theory with author preference to improve the quality of scientific literature recommendation. [Methods] Based on heterogeneous information network theory, we propose a scientific literature recommendation method that fuses multiple kinds of semantic information. First, the meta paths in the heterogeneous information network of scientific literature are weighted using author preference information. Second, the DPRel algorithm is used to calculate the relevance between authors and papers. On this basis, a weighted author-literature matrix is constructed, and the recommendation list is obtained by sorting in descending order of relevance. [Results] Experimental datasets were collected from the Web of Science. Compared with a recommendation method that computes author-literature relevance from a single meta path, the proposed method raised the average successful recommendation rate by 6, 8 and 6 percentage points on the three datasets, and the improvement rate of successfully recommended papers was 14.8%, 27.6% and 13.0%, respectively. [Limitations] Keywords were unified manually during data preprocessing, which is impractical for massive datasets. [Conclusions] The proposed method improves the quality of scientific literature recommendation in heterogeneous information networks.

Keywords: Scientific Literature Recommendation; Heterogeneous Information Network; Author Preference; Meta Path Weighting
Received: 2021-02-01      Published: 2021-09-15
Chinese Library Classification (ZTFLH):  TP393 G250
Funding: *National Natural Science Foundation of China (71573199)
Corresponding author: Qin Chunxiu     ORCID: 0000-0002-7809-4145     E-mail: cxqin@xidian.edu.cn
Cite this article:
Wang Qinjie, Qin Chunxiu, Ma Xubu, Liu Huailiang, Xu Cunzhen. Recommending Scientific Literature Based on Author Preference and Heterogeneous Information Network. Data Analysis and Knowledge Discovery, 2021, 5(8): 54-64.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2021.0102      or      http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2021/V5/I8/54
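Fig.1  Outline of the literature recommendation method based on author preference and heterogeneous information network
Fig.2  Example of a scientific literature network [24]
Fig.3  Example of a scientific literature network schema [24]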
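| Meta path | Meaning |
| --- | --- |
| P1 = APTP | Papers that share a keyword with a paper published by the author |
| P2 = APJP | Papers published in the same journal as a paper by the author |

Table 1  Selected meta paths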
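| Adjacency matrix | WC1 matrix size | WC2 matrix size | WC3 matrix size |
| --- | --- | --- | --- |
| Author-paper (AP) | 622×200 | 723×200 | 971×200 |
| Keyword-paper (TP) | 885×200 | 861×200 | 768×200 |
| Journal-paper (JP) | 30×200 | 51×200 | 16×200 |
| Paper-keyword (PT) | 200×885 | 200×861 | 200×768 |
| Paper-journal (PJ) | 200×30 | 200×51 | 200×16 |

Table 2  Adjacency matrices obtained from data preprocessing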
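The meta paths in Table 1 can be read as chains of the adjacency matrices in Table 2: multiplying AP, PT and TP counts the path instances along APTP between each author and each paper, and AP, PJ and JP do the same for APJP. The following is a minimal sketch on toy data, assuming plain path counts. It is not the DPRel measure named in the abstract, and the toy matrices and the `matmul` and `transpose` helpers are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns, e.g. PT (papers x keywords) -> TP."""
    return [list(col) for col in zip(*m)]


# Hypothetical toy network: 2 authors, 3 papers, 2 keywords, 2 journals.
AP = [[1, 0, 0],   # author 0 wrote paper 0
      [0, 1, 0]]   # author 1 wrote paper 1
PT = [[1, 0],      # paper 0 carries keyword 0
      [0, 1],      # paper 1 carries keyword 1
      [1, 1]]      # paper 2 carries both keywords
PJ = [[1, 0],      # paper 0 appeared in journal 0
      [0, 1],      # paper 1 appeared in journal 1
      [0, 1]]      # paper 2 appeared in journal 1
TP = transpose(PT)
JP = transpose(PJ)

# Path-instance counts between authors (rows) and papers (columns).
aptp = matmul(matmul(AP, PT), TP)
apjp = matmul(matmul(AP, PJ), JP)

print(aptp)  # [[1, 0, 1], [0, 1, 1]]
print(apjp)  # [[1, 0, 0], [0, 1, 1]]
```

In the toy data, author 0 reaches paper 2 through a shared keyword but not through a shared journal, which is the kind of difference between meta paths that the weighting step is meant to exploit.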
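| Author | Paper 1 | Paper 2 | Paper 3 | Paper 4 | Paper 5 | Paper 6 | Paper 7 | Paper 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dwivedi, Y K | 0.0028 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0267 |
| Rana, N P | 0.0019 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0283 |
| Hamari, J | 0.0052 | 0 | 0.1203 | 0 | 0 | 0 | 0.0278 | 0 |
| Haustein, S | 0 | 0 | 0.0107 | 0 | 0.0048 | 0 | 0 | 0 |
| Lariviere, V | 0 | 0 | 0.0185 | 0 | 0.0356 | 0 | 0 | 0 |
| Chang, V | 0.0366 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Alalwan, A A | 0.0025 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Gani, A | 0.0048 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Hong, I B | 0.0048 | 0 | 0.0604 | 0 | 0 | 0 | 0 | 0 |
| Benitez-Amado, J | 0.0027 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Nambisan, S | 0 | 0 | 0.0435 | 0 | 0 | 0 | 0.2688 | 0.0080 |
| Tsou, A | 0 | 0 | 0.0162 | 0 | 0.0393 | 0 | 0 | 0 |
| Venkatesh, V | 0 | 0 | 0 | 0 | 0 | 0 | 0.0057 | 0.0057 |
| Moody, G D | 0 | 0 | 0 | 0 | 0 | 0 | 0.0048 | 0.0048 |
| Ahmed, E | 0.0048 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table 3  Author-literature relevance matrix (partial)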
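The abstract states that meta paths are weighted by author preference before the weighted author-literature matrix is built, but this page does not say how the weights are derived. The sketch below therefore assumes that each author already has one weight per meta path, summing to 1, and combines two per-path relevance matrices row by row. The `combine` function and all numbers are hypothetical.

```python
def combine(rel_by_path, weights_by_author):
    """Weighted sum of per-path author-paper relevance matrices.

    rel_by_path: one matrix per meta path, each shaped authors x papers.
    weights_by_author: one weight vector per author, one weight per meta path
    (assumed to be given; their derivation is not covered here).
    """
    n_paths = len(rel_by_path)
    n_authors = len(rel_by_path[0])
    n_papers = len(rel_by_path[0][0])
    combined = []
    for a in range(n_authors):
        w = weights_by_author[a]
        row = [
            sum(w[p] * rel_by_path[p][a][j] for p in range(n_paths))
            for j in range(n_papers)
        ]
        combined.append(row)
    return combined


# Hypothetical relevance scores along APTP and APJP for 2 authors, 3 papers.
rel_aptp = [[0.2, 0.0, 0.4], [0.0, 0.5, 0.1]]
rel_apjp = [[0.0, 0.6, 0.0], [0.3, 0.0, 0.1]]

# Author 0 is assumed to favour the keyword path, author 1 is indifferent.
weights = [[0.75, 0.25], [0.5, 0.5]]

weighted = combine([rel_aptp, rel_apjp], weights)
for row in weighted:
    print([round(x, 4) for x in row])
```

Keeping the weights per author rather than global lets the same pair of meta paths produce different rankings for authors with different preferences.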
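| Author | Top-1 | Top-2 | Top-3 | Top-4 | Top-5 | Top-6 | Top-7 | Top-8 | Top-9 | Top-10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Dwivedi, Y K | 106 | 85 | 72 | 35 | 52 | 28 | 79 | 106 | 85 | 72 |
| Lariviere, V | 114 | 89 | 10 | 88 | 14 | 198 | 110 | 114 | 89 | 10 |
| Gani, A | 15 | 80 | 65 | 147 | 57 | 189 | 69 | 15 | 80 | 65 |
| Weitzel, T | 40 | 44 | 125 | 116 | 84 | 145 | 27 | 40 | 44 | 125 |
| Dou, Y F | 162 | 22 | 42 | 164 | 1 | 15 | 28 | 162 | 22 | 42 |
| Hashem, I A T | 15 | 80 | 65 | 147 | 57 | 189 | 69 | 15 | 80 | 65 |
| Savova, G | 101 | 126 | 171 | 155 | 25 | 172 | 76 | 101 | 126 | 171 |
| Bornmann, L | 132 | 198 | 168 | 10 | 88 | 14 | 105 | 132 | 198 | 168 |
| Siponen, M | 47 | 33 | 78 | 8 | 71 | 183 | 157 | 47 | 33 | 78 |
| Ngangue, P | 50 | 24 | 171 | 169 | 164 | 148 | 79 | 50 | 24 | 171 |
| Nikfarjam, A | 23 | 172 | 126 | 101 | 50 | 171 | 155 | 23 | 172 | 126 |
| Ohno-Machado, L | 20 | 199 | 36 | 107 | 62 | 65 | 23 | 20 | 199 | 36 |
| Venkatesh, V | 18 | 71 | 17 | 169 | 168 | 155 | 148 | 18 | 71 | 17 |
| Ruan, X Y | 126 | 101 | 171 | 155 | 172 | 25 | 36 | 126 | 101 | 171 |

Table 4  Recommended paper lists by paper number (partial)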
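A recommendation list of this kind comes from sorting one author's row of the relevance matrix in descending order. The sketch below shows that step, using the Nambisan, S row of Table 3 as input. Because Table 3 covers only 8 of the 200 papers, the result is not expected to match any row of Table 4. Skipping zero-relevance papers is a choice made for this sketch, and `top_n` is a hypothetical helper.

```python
def top_n(relevance_row, n):
    """Return 1-based paper numbers ordered by descending relevance.

    Papers with zero relevance are skipped (an assumption of this sketch).
    """
    candidates = [i for i, score in enumerate(relevance_row, start=1) if score > 0]
    ranked = sorted(candidates, key=lambda i: relevance_row[i - 1], reverse=True)
    return ranked[:n]


# Relevance of papers 1-8 for Nambisan, S, copied from Table 3.
nambisan = [0, 0, 0.0435, 0, 0, 0, 0.2688, 0.0080]

print(top_n(nambisan, 3))  # [7, 3, 8]
```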
| Author | Recommended paper cited by the author | Citation count |
| --- | --- | --- |
| Dwivedi, Y K | 1. An empirical validation of a unified model of electronic government adoption (UMEGA) | 23 |
| | 2. Factors influencing adoption of mobile banking by Jordanian bank customers: Extending UTAUT2 with trust | 11 |
| | 3. Factors affecting adoption of online banking: A meta-analytic structural equation modeling study | 2 |
| | 4. Citizen's adoption of an e-government system: Validating extended social cognitive theory (SCT) | 6 |
| | 5. Social media in marketing: A review and analysis of the existing literature | 14 |
| | 6. Consumer adoption of mobile banking in Jordan: Examining the role of usefulness, ease of use, perceived risk and self-efficacy | 4 |
| | 7. Acceptance and use predictors of open data technologies: Drawing upon the unified theory of acceptance and use of technology | 3 |
| | 8. Acceptance of mobile banking framework in Pakistan | 2 |
| | 9. Co-citation and cluster analyses of extant literature on social networks | 5 |
| | 10. A generalised adoption model for services: A cross-country comparison of mobile health (m-health) | 15 |
| Hashem, I A T | 1. The role of big data in smart city | 2 |
| | 2. Big data: From beginning to future | 1 |
| | 3. Blockchain's roles in strengthening cybersecurity and protecting privacy | 1 |
| | 4. Beyond the hype: Big data concepts, methods, and analytics | 3 |
| Bornmann, L | 1. Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references | 2 |
| | 2. Do "altmetrics" correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary | 2 |
| | 3. The journal coverage of Web of Science and Scopus: a comparative analysis | 1 |
| | 4. Google Scholar, Scopus and the Web of Science: a longitudinal and cross-disciplinary comparison | 2 |

Table 5  Authors' citations of the recommended papers (partial)
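| Group | Experimental: authors with a successful recommendation | Experimental: success rate | Baseline: authors with a successful recommendation | Baseline: success rate |
| --- | --- | --- | --- | --- |
| Group 1 | 6 | 60% | 5 | 50% |
| Group 2 | 5 | 50% | 4 | 40% |
| Group 3 | 7 | 70% | 6 | 60% |
| Group 4 | 6 | 60% | 6 | 60% |
| Group 5 | 5 | 50% | 5 | 50% |
| Average recommendation success rate | | 58% | | 52% |

Table 6  Recommendation success rate on the WC1 dataset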
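| Group | Experimental: authors with a successful recommendation | Experimental: success rate | Baseline: authors with a successful recommendation | Baseline: success rate |
| --- | --- | --- | --- | --- |
| Group 1 | 6 | 60% | 6 | 60% |
| Group 2 | 7 | 70% | 6 | 60% |
| Group 3 | 5 | 50% | 4 | 40% |
| Group 4 | 6 | 60% | 6 | 60% |
| Group 5 | 6 | 60% | 4 | 40% |
| Average recommendation success rate | | 60% | | 52% |

Table 7  Recommendation success rate on the WC2 dataset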
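| Group | Experimental: authors with a successful recommendation | Experimental: success rate | Baseline: authors with a successful recommendation | Baseline: success rate |
| --- | --- | --- | --- | --- |
| Group 1 | 7 | 70% | 6 | 60% |
| Group 2 | 8 | 80% | 8 | 80% |
| Group 3 | 8 | 80% | 6 | 60% |
| Group 4 | 7 | 70% | 7 | 70% |
| Group 5 | 8 | 80% | 8 | 80% |
| Average recommendation success rate | | 76% | | 70% |

Table 8  Recommendation success rate on the WC3 dataset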
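The averages in Tables 6 to 8 can be rechecked from the per-group counts. Each group appears to contain 10 authors, since 6 successful authors correspond to 60%; this group size is inferred from the tables rather than stated on this page. A small sketch of the check, with hypothetical names:

```python
GROUP_SIZE = 10  # inferred from the tables: 6 successful authors = 60%

# Authors with a successful recommendation per group, copied from Tables 6-8.
successes = {
    "WC1": {"experimental": [6, 5, 7, 6, 5], "baseline": [5, 4, 6, 6, 5]},
    "WC2": {"experimental": [6, 7, 5, 6, 6], "baseline": [6, 6, 4, 6, 4]},
    "WC3": {"experimental": [7, 8, 8, 7, 8], "baseline": [6, 8, 6, 7, 8]},
}


def average_success_rate(counts, group_size=GROUP_SIZE):
    """Share of authors with a successful recommendation, in percent."""
    return 100 * sum(counts) / (group_size * len(counts))


for name, arms in successes.items():
    exp = average_success_rate(arms["experimental"])
    base = average_success_rate(arms["baseline"])
    print(f"{name}: {exp:.0f}% vs {base:.0f}% (+{exp - base:.0f} percentage points)")
```

The differences come out as 6, 8 and 6 percentage points, which matches the gains reported in the abstract.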
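| Group | WC1 experimental (papers) | WC1 baseline (papers) | WC2 experimental (papers) | WC2 baseline (papers) | WC3 experimental (papers) | WC3 baseline (papers) |
| --- | --- | --- | --- | --- | --- | --- |
| Group 1 | 9 | 7 | 20 | 17 | 17 | 15 |
| Group 2 | 5 | 4 | 14 | 8 | 19 | 17 |
| Group 3 | 12 | 10 | 11 | 10 | 18 | 15 |
| Group 4 | 22 | 21 | 19 | 15 | 14 | 13 |
| Group 5 | 14 | 12 | 10 | 8 | 10 | 9 |
| Average number of successfully recommended papers | 12.4 | 10.8 | 14.8 | 11.6 | 15.6 | 13.8 |
| Improvement rate of successfully recommended papers | 14.8% | | 27.6% | | 13.0% | |

Table 9  Number of successfully recommended papers per group in the three datasets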
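The improvement rates in Table 9 are reproduced if the rate is read as the relative increase of the experimental group's average over the baseline's average. That reading is an interpretation checked against the reported figures, not a definition taken from this page. A minimal sketch, with hypothetical function names:

```python
# Successfully recommended papers per group, copied from Table 9
# as (experimental, baseline) pairs.
papers = {
    "WC1": ([9, 5, 12, 22, 14], [7, 4, 10, 21, 12]),
    "WC2": ([20, 14, 11, 19, 10], [17, 8, 10, 15, 8]),
    "WC3": ([17, 19, 18, 14, 10], [15, 17, 15, 13, 9]),
}


def mean(values):
    return sum(values) / len(values)


def improvement_rate(experimental, baseline):
    """Relative increase of the experimental mean over the baseline mean, in percent."""
    base_mean = mean(baseline)
    return 100 * (mean(experimental) - base_mean) / base_mean


for name, (exp, base) in papers.items():
    print(f"{name}: {mean(exp):.1f} vs {mean(base):.1f} papers, "
          f"improvement {improvement_rate(exp, base):.1f}%")
```

For WC1 this gives (12.4 - 10.8) / 10.8, or about 14.8%, and the other two datasets follow the same pattern.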
[1] Li Chutong, Mo Zan. Research on Recommendation System Based on Collaborative Filtering Algorithm[J]. Information & Communications, 2018(2):38-39.
[2] Liu Xuhui. Research on Scientific and Technical Literature Recommendation Algorithm Based on Topic Diversity and Influence[J]. Information Studies: Theory & Application, 2017, 40(12):134-138.
[3] Gu Yingzhi, Dong Cheng, Pei Bingbing, et al. The Recommendation Approach Based on Term Extraction and Graduation Matching[J]. Technology Intelligence Engineering, 2018, 4(3):58-66.
[4] Christakopoulou E, Karypis G. Local Item-Item Models for Top-N Recommendation[C]// Proceedings of the 10th ACM Conference on Recommender Systems. 2016: 67-74.
[5] Wang Z Y, Liu Y, Yang J J, et al. A Personalization-Oriented Academic Literature Recommendation Method[J]. Data Science Journal, 2015, 14. DOI: 10.5334/dsj-2015-017.
[6] Liu Jiaqi, Wang Quanmin. College Personalized Book Recommendation System Based on Improved User Collaborative Filtering Algorithm[J]. Computer & Digital Engineering, 2020, 48(10):2458-2461, 2479.
[7] Pan L L, Dai X Y, Huang S J, et al. Academic Paper Recommendation Based on Heterogeneous Graph[C]// Proceedings of the 14th China National Conference, CCL 2015 and 3rd International Symposium, NLP-NABD 2015. 2015:381-392.
[8] Wu Liaoyuan, Jiang Jun, Wang Gang. Study of Scientific Paper Recommendation Method Based on Unified Probabilistic Matrix Factorization in Scientific Social Networks[J]. Computer Science, 2016, 43(9):213-217.
[9] Zhang Li. Research on Recommendation Algorithm of Scientific Papers[D]. Beijing: Beijing University of Posts and Telecommunications, 2017.
[10] Zhang Qi, Zhang Yinghua. Research on an Approach of Context Aware Collaborative Recommend for Scientific & Technical Literatures[J]. New Technology of Library and Information Service, 2012(2):10-17.
[11] Zhu Xiang, Zhang Yunqiu, Hui Qiuyue. An Academic Literature Recommendation Method Based on Disciplinary Heterogeneous Knowledge Network[J]. Library Journal, 2020, 39(8):103-110.
[12] Zhao Chuan, Zhang Kaihan, Liang Jiye. Asymmetric Recommendation Algorithm in Heterogeneous Information Network[J]. Journal of Frontiers of Computer Science & Technology, 2020, 14(6):939-946.
[13] Liu Yunfeng, Sun Ping, Ge Zhiyuan. Literature Review of Heterogeneous Information Network Recommendation[J]. Information Science, 2020, 38(6):151-157.
[14] Sun Yizhou, Han Jiawei. Mining Heterogeneous Information Networks: Principles and Methodologies[M]. Translated by Duan Lei, Zhu Min, Tang Changjie. Beijing: China Machine Press, 2016.
[15] Suo X T, Wei F, Yu K. Entity Recommendation via Integrating Multiple Types of Implicit Feedback in Heterogeneous Information Network[C]// Proceedings of 2017 IEEE International Conference on Data Mining Workshops. 2017: 781-786.
[16] Wang Yonggui, Mei Xuanwei. Fuzzy Recommendation Algorithm for Asymmetric Heterogeneous Information Networks[J]. Computer Engineering and Applications, 2020, 56(23):74-79.
[17] Vahedian F, Burke R, Mobasher B. Weighted Random Walks for Meta-Path Expansion in Heterogeneous Networks[C]// Proceedings of the 10th ACM Conference on Recommender Systems. 2016:15-19.
[18] Zhang M X, Wang J H, Wang W. HeteRank: A General Similarity Measure in Heterogeneous Information Networks by Integrating Multi-type Relationships[J]. Information Sciences, 2018, 453:389-407. DOI: 10.1016/j.ins.2018.04.022.
[19] Gupta M, Kumar P. Recommendation Generation Using Personalized Weight of Meta-paths in Heterogeneous Information Networks[J]. European Journal of Operational Research, 2020, 284(2):660-674.
[20] Wang Gensheng, Pan Fangzheng. Matrix Factorization Algorithm for Weighted Heterogeneous Information Networks[J]. Data Analysis and Knowledge Discovery, 2020, 4(12):76-84.
[21] Zhang Haixia, Lv Zhen, Zhang Chuanting, et al. An Improved Collaborative Filtering Recommendation Algorithm with Weighted Heterogeneous Information[J]. Journal of University of Electronic Science and Technology of China, 2018, 47(1):112-116, 152.
[22] Shi C, Li Y T, Zhang J W, et al. A Survey of Heterogeneous Information Network Analysis[J]. IEEE Transactions on Knowledge & Data Engineering, 2017, 29(1):17-37.
[23] Sun Y Z, Han J W. Meta-Path-Based Search and Mining in Heterogeneous Information Networks[J]. Tsinghua Science and Technology, 2013, 18(4):329-338. DOI: 10.1109/TST.2013.6574671.
[24] Gupta M, Kumar P, Bhasker B. HeteClass: A Meta-path Based Framework for Transductive Classification of Objects in Heterogeneous Information Networks[J]. Expert Systems with Applications, 2017, 68:106-122. DOI: 10.1016/j.eswa.2016.10.013.
[25] Sun Y Z, Han J W, Yan X F, et al. PathSim: Meta Path-based Top-K Similarity Search in Heterogeneous Information Networks[J]. Proceedings of the VLDB Endowment, 2011, 4(11):992-1003. DOI: 10.14778/3402707.3402736.
[26] Christakis N A, Fowler J H. Social Contagion Theory: Examining Dynamic Social Networks and Human Behavior[J]. Statistics in Medicine, 2013, 32(4):556-577. DOI: 10.1002/sim.5408. PMID: 22711416.
[27] Xu Hongyan, Wang Dan, Wang Fuhai, et al. User Relevance Measure Method Combining Latent Dirichlet Allocation and Meta-Path Analysis[J]. Journal of Computer Applications, 2019, 39(11):3288-3292.
[28] Gupta M, Kumar P, Bhasker B. DPRel: A Meta-Path Based Relevance Measure for Mining Heterogeneous Networks[J]. Information Systems Frontiers, 2019, 21(5):979-995. DOI: 10.1007/s10796-017-9811-x.