Data Analysis and Knowledge Discovery  2023, Vol. 7 Issue (5): 48-59     https://doi.org/10.11925/infotech.2096-3467.2022.0424
Research Paper
Paper Recommendation Based on Academic Knowledge Graph and Subject Feature Embedding*
Li Kaijun1, Niu Zhendong1, Shi Kaize1,2, Qiu Ping1
1School of Computer Science & Technology, Beijing Institute of Technology, Beijing 100081, China
2Australian Artificial Intelligence Institute, University of Technology Sydney, Sydney 2007, Australia
Abstract

[Objective] This paper proposes a paper recommendation method that fuses multiple features to provide researchers with accurate paper recommendations. [Methods] We design a feature extraction framework that extracts entity-relation features and topic text features from an academic paper knowledge graph and fuses them. To improve learning on the high-dimensional fused features, we propose a paper recommendation method built on a knowledge-embedding-based encoder-decoder model. [Results] On the DBLP-v11 dataset, the proposed model improves Recall and MRR by 8.9 and 2.9 percentage points, respectively, over the second-best model. [Limitations] The graph feature learning method does not consider the real-world weights of entities. [Conclusions] The results on the paper recommendation task show that the proposed method learns high-dimensional features effectively and can inform subsequent research.

Keywords: Paper Recommendation; Academic Paper Knowledge Graph; Knowledge Embedding; Feature Fusion; Feature Learning
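Received: 2022-05-04      Published: 2022-07-29
CLC number (ZTFLH): TP391; G25
Funding: *This work is one of the research outcomes of the National Key R&D Program of China (2019YFB1406303).
Corresponding author: Niu Zhendong, ORCID: 0000-0002-0576-7572, E-mail: zniu@bit.edu.cn.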
Cite this article:
Li Kaijun, Niu Zhendong, Shi Kaize, Qiu Ping. Paper Recommendation Based on Academic Knowledge Graph and Subject Feature Embedding[J]. Data Analysis and Knowledge Discovery, 2023, 7(5): 48-59.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.0424      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2023/V7/I5/48
Fig.1  Framework of the KE-EDM model
Symbol   Description          Count
P        Academic paper       696,131
V        Publication venue    4,083
Y        Year                 5
A_f      Affiliation          600,266
A_u      Author               1,015,371
Total                         2,315,856
Table 1  Entities
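Symbol     Description
A_u-P      Author writes paper
P-V        Paper is published in venue
A_u-A_f    Author is affiliated with institution
P-Y        Paper is published in year
P-P        Paper cites paper
A_u-A_u    Author collaborates with author
Table 2  Relations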
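Field                 Sample data or basic description
Paper ID              1000018889
Title                 Remote Policy Enforcement ···Execution in Mobile Environments
Author information    Author ID, name, and affiliated institution
Venue                 Venue ID and name
Year                  2013
References            List of the IDs of the papers cited by this paper
Abstract              Both in ···viable and effective.
Table 3  Dataset metadata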
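The relations in Table 2 can be read directly off a metadata record like the one in Table 3. The following is a minimal sketch of that mapping, not the authors' implementation: the record field names (`id`, `year`, `venue_id`, `references`, `authors`, `org`), the relation labels, and the placeholder IDs in the example are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def record_to_triples(record: Dict) -> List[Triple]:
    """Turn one paper record into knowledge-graph triples (hypothetical schema)."""
    pid = f"P:{record['id']}"
    triples: List[Triple] = []

    # P-V: paper is published in venue
    if record.get("venue_id"):
        triples.append((pid, "published_in", f"V:{record['venue_id']}"))

    # P-Y: paper is published in year
    if record.get("year") is not None:
        triples.append((pid, "published_year", f"Y:{record['year']}"))

    # P-P: paper cites paper
    for ref in record.get("references", []):
        triples.append((pid, "cites", f"P:{ref}"))

    authors = record.get("authors", [])
    for author in authors:
        aid = f"Au:{author['id']}"
        # A_u-P: author writes paper
        triples.append((aid, "writes", pid))
        # A_u-A_f: author is affiliated with institution (may be missing)
        if author.get("org"):
            triples.append((aid, "affiliated_with", f"Af:{author['org']}"))

    # A_u-A_u: each unordered pair of co-authors appears once
    ids = [f"Au:{a['id']}" for a in authors]
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            triples.append((ids[i], "collaborates_with", ids[j]))

    return triples


# Placeholder record: only the paper ID and year come from Table 3.
example = {
    "id": "1000018889",
    "year": 2013,
    "venue_id": "v1",
    "references": ["p2", "p3"],
    "authors": [{"id": "a1", "org": "org1"}, {"id": "a2", "org": None}],
}
triples = record_to_triples(example)
```

Prefixing each ID with its entity type keeps the five entity kinds of Table 1 in separate namespaces, so a paper and an author with the same raw ID cannot collide.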
         KE-EDM (this paper)    HKE-ARNN               Citeomatic             ClusCite               VOPRec
Top N    P      R      F1       P      R      F1       P      R      F1       P      R      F1       P      R      F1
1        0.686  0.337  0.452    0.654  0.062  0.113    0.469  0.039  0.072    0.412  0.019  0.036    0.313  0.053  0.091
5        0.516  0.406  0.454    0.511  0.162  0.246    0.348  0.140  0.200    0.352  0.152  0.212    0.187  0.160  0.172
10       0.265  0.463  0.337    0.281  0.291  0.286    0.294  0.235  0.261    0.242  0.212  0.226    0.159  0.280  0.203
20       0.150  0.682  0.247    0.160  0.593  0.252    0.222  0.354  0.273    0.195  0.300  0.236    0.135  0.410  0.203
50       0.064  0.705  0.117    0.067  0.596  0.120    0.139  0.539  0.221    0.111  0.427  0.176    0.059  0.460  0.105
P = Precision, R = Recall, F1 = F1 score.
Table 4  Performance of the KE-EDM model and the baseline models
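For readers who want to reproduce the kind of numbers reported in Table 4, the sketch below shows the standard definitions of precision, recall, and F1 at a cutoff N. It is an illustration under assumptions, not the authors' evaluation script: the function names and the toy ranking are hypothetical. The last lines check two values against Table 4. The F1 at Top 1 for KE-EDM follows from its precision and recall, and the Recall gap at Top 20 between KE-EDM and HKE-ARNN is 0.089, which appears to correspond to the 8.9-point gain stated in the abstract.

```python
from typing import Iterable, Sequence, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0 when both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def precision_recall_f1_at_n(
    ranked: Sequence[str], relevant: Iterable[str], n: int
) -> Tuple[float, float, float]:
    """Score the top-n items of a ranked list against the relevant set."""
    relevant_set = set(relevant)
    hits = sum(1 for item in ranked[:n] if item in relevant_set)
    precision = hits / n if n > 0 else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy example (hypothetical data): 2 of the top 5 are relevant, 4 relevant overall.
p, r, f1 = precision_recall_f1_at_n(["a", "b", "c", "d", "e"], {"a", "c", "x", "y"}, 5)

# Consistency checks against Table 4 (KE-EDM, Top 1; Recall at Top 20).
f1_top1 = round(f1_score(0.686, 0.337), 3)
recall_gap_top20 = round(0.682 - 0.593, 3)
```

Because the table entries are already rounded, recomputing F1 from the printed precision and recall can differ from the printed F1 in the last digit for some rows.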
Fig.2  MRR scores of each model
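MRR (mean reciprocal rank) averages, over all queries, the reciprocal of the rank at which the first relevant item appears. The sketch below uses this standard definition; whether the authors count queries with no relevant hit as 0, as done here, is an assumption, and the toy rankings are hypothetical.

```python
from typing import Iterable, List, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[str]], relevants: Sequence[Iterable[str]]
) -> float:
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    if not rankings:
        return 0.0
    reciprocal_ranks: List[float] = []
    for ranked, relevant in zip(rankings, relevants):
        relevant_set = set(relevant)
        rr = 0.0
        for rank, item in enumerate(ranked, start=1):
            if item in relevant_set:
                rr = 1.0 / rank
                break
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)


# Toy example: first hits at rank 2, rank 1, and no hit.
mrr = mean_reciprocal_rank(
    [["a", "b", "c"], ["x", "y", "z"], ["m", "n"]],
    [{"b"}, {"x"}, {"q"}],
)
```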
Topic feature       Precision@20   Recall@20   F1 score   MRR
Title & abstract    0.151          0.682       0.247      0.687
Title               0.150          0.655       0.244      0.659
Abstract            0.151          0.662       0.245      0.686
Table 5  Effect of embedding different topic features on model performance
Fig.3  Effect of vector feature dimensionality on recommendation performance
Method    Precision@20   Recall@20   F1 score   MRR
TransD    0.151          0.682       0.247      0.687
TransR    0.150          0.675       0.245      0.689
TransH    0.146          0.664       0.239      0.656
TransE    0.144          0.650       0.235      0.649
Table 6  Effect of different graph feature representation learning methods on model performance
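To make the comparison in Table 6 concrete, the sketch below contrasts the translation distances used by TransE and TransD as they are usually defined in the literature. It is not the paper's training code. It assumes entity and relation vectors share the same dimension, in which case the TransD mapping (r_p h_p^T + I) h reduces to h + (h_p · h) r_p; all vectors in the example are made up.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def translation_distance(h: Vector, r: Vector, t: Vector) -> float:
    """L2 norm of h + r - t; smaller means the triple is more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def transe_distance(h: Vector, r: Vector, t: Vector) -> float:
    """TransE: entities and relations live in one shared space."""
    return translation_distance(h, r, t)


def transd_project(e: Vector, e_p: Vector, r_p: Vector) -> List[float]:
    """TransD projection for equal dimensions: e + (e_p . e) * r_p."""
    scale = dot(e_p, e)
    return [ei + scale * rpi for ei, rpi in zip(e, r_p)]


def transd_distance(
    h: Vector, h_p: Vector, r: Vector, r_p: Vector, t: Vector, t_p: Vector
) -> float:
    """TransD: project head and tail with entity- and relation-specific vectors."""
    return translation_distance(
        transd_project(h, h_p, r_p), r, transd_project(t, t_p, r_p)
    )


# Toy vectors (hypothetical): h + r equals t exactly.
h, r, t = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
zero = [0.0, 0.0]

d_transe = transe_distance(h, r, t)
# With all projection vectors set to zero, TransD reduces to TransE.
d_transd_zero = transd_distance(h, zero, r, zero, t, zero)
# A non-zero head projection changes the distance for the same triple.
d_transd = transd_distance(h, [1.0, 0.0], r, [0.0, 1.0], t, zero)
```

The extra projection vectors let each entity-relation pair get its own mapping, which is the additional flexibility TransD has over TransE.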