Data Analysis and Knowledge Discovery
A Fusion Model of Network Representation Learning and Topic Model for Author Cooperation Prediction
Zhang Xin, Wen Yi, Xu Haiyun
(Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China)
(Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China)
Abstract

[Objective] This paper proposes a method for predicting scientific collaboration that fuses network representation learning with the author-topic model.

[Methods] Embedding vectors for author nodes are computed with classical network representation learning methods, and the structural similarity between authors is measured by cosine similarity. Topic vectors for authors are obtained from the author-topic model (ATM), and the topic similarity between authors is measured by Hellinger distance. The two similarity measures are then fused linearly, and Bayesian optimization is used to select the fusion hyperparameters.
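The fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single weight `alpha`, and the conversion of Hellinger distance into a similarity as `1 - distance` are assumptions made here for illustration, since the abstract does not specify them.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (structural similarity)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    The result lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def fused_similarity(emb_u, emb_v, topics_u, topics_v, alpha=0.5):
    """Linear fusion of structural and topic similarity.

    Assumption: the topic similarity is taken as 1 - Hellinger distance,
    and alpha in [0, 1] is a hypothetical fusion weight.
    """
    structural = cosine_similarity(emb_u, emb_v)
    topical = 1.0 - hellinger_distance(topics_u, topics_v)
    return alpha * structural + (1.0 - alpha) * topical
```

In this sketch, `emb_u` and `emb_v` would be node embeddings (for example from node2vec) and `topics_u` and `topics_v` would be per-author topic distributions from the author-topic model; a higher fused score suggests a more likely future collaboration.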

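[Results] An empirical study on the NIPS paper dataset shows that, after Bayesian parameter selection, the best-performing model, node2vec+ATM, achieves an AUC of 0.9271. This is 0.1856 higher than the baseline model and also higher than some existing representation learning models that incorporate external information.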
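To make the AUC figure concrete: in link prediction, AUC can be read as the probability that a randomly chosen true collaboration pair receives a higher score than a randomly chosen non-collaborating pair, with ties counted as one half. If the reported gain is an absolute difference, it implies a baseline AUC of about 0.9271 - 0.1856 = 0.7415. The sketch below computes AUC this way and tunes a fusion weight with a plain grid search. The grid search is only a simple stand-in for the Bayesian optimization used in the paper, and the names `auc`, `best_alpha`, and `steps` are hypothetical.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, following the usual convention.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def best_alpha(pos_pairs, neg_pairs, steps=10):
    """Grid search over the fusion weight (stand-in for Bayesian optimization).

    Each pair is a tuple (structural_similarity, topic_similarity).
    Returns (alpha, auc) for the first weight that reaches the best AUC.
    """
    best = (0.0, -1.0)
    for i in range(steps + 1):
        alpha = i / steps
        pos = [alpha * s + (1.0 - alpha) * t for s, t in pos_pairs]
        neg = [alpha * s + (1.0 - alpha) * t for s, t in neg_pairs]
        score = auc(pos, neg)
        if score > best[1]:
            best = (alpha, score)
    return best


# Toy data: neither similarity alone separates the classes perfectly,
# but some mixtures of the two do.
pos_pairs = [(0.9, 0.2), (0.3, 0.8)]  # author pairs that did collaborate
neg_pairs = [(0.5, 0.1), (0.1, 0.5)]  # author pairs that did not
alpha, score = best_alpha(pos_pairs, neg_pairs)
```

On this toy data, each similarity used alone misranks one of the four positive/negative pairs, while a suitable mixture ranks all of them correctly, which is the intuition behind fusing structure and content features.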

[Limitations] This study considers only the content of the authors' publications and does not incorporate further attributes, such as the authors' institutions and geographic locations, into the model.

[Conclusions] The proposed fusion model takes both structural and content features into account and achieves better collaboration prediction than network representation learning alone.

Key words: network representation learning; author-topic model; model fusion; scientific collaboration prediction
Published: 2020-11-24
Chinese Library Classification (ZTFLH): G350
Cite this article:
Zhang Xin, Wen Yi, Xu Haiyun. A Fusion Model of Network Representation Learning and Topic Model for Author Cooperation Prediction [J]. Data Analysis and Knowledge Discovery, 10.11925/infotech.2096-3467.2020.0515.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2020.0515 or http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y0/V/I/1