Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (3): 88-100    DOI: 10.11925/infotech.2096-3467.2020.0515
A Prediction Model with Network Representation Learning and Topic Model for Author Collaboration
Zhang Xin1, Wen Yi1,2, Xu Haiyun1,2
1Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
2Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract

[Objective] This paper proposes a method for predicting scientific collaboration based on network representation learning and the author-topic model. [Methods] First, we built embedding vector representations of authors using a network representation learning method. Second, we calculated the structural similarity between authors as the cosine similarity of their embeddings. Third, we obtained topic representations of authors with the author-topic model (ATM) and computed the authors’ topic similarity with the Hellinger distance. Finally, we linearly combined the two similarity measures and used Bayesian optimization to select the hyperparameters. [Results] We evaluated the proposed method on the NIPS datasets; after Bayesian parameter selection, the best-performing model was node2vec+ATM. It achieved an AUC of 0.9271, which was 0.1856 higher than that of the benchmark model. [Limitations] We did not include the authors’ institutions or geographic locations in the model. [Conclusions] The proposed model uses both structural and content features to improve the prediction results of network representation learning.
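
The similarity computation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the weight `alpha`, and the conversion of the Hellinger distance into a similarity as `1 - distance` are assumptions made for illustration only, since the abstract does not give the exact form of the linear combination. The inputs are assumed to be an embedding vector per author (for example, from node2vec) and a topic probability distribution per author (for example, from an author-topic model).

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors (structural similarity)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    # Return 0.0 when either vector is all zeros to avoid division by zero.
    return float(u @ v / denom) if denom > 0 else 0.0


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions.

    The value lies in [0, 1]: 0 for identical distributions,
    1 for distributions with disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def combined_score(emb_a, emb_b, topics_a, topics_b, alpha=0.5):
    """Linear combination of structural and topic similarity.

    Assumption: topic similarity is taken as 1 - Hellinger distance, and
    alpha (hypothetical name) is the mixing weight that a hyperparameter
    search such as Bayesian optimization would tune.
    """
    structural = cosine_similarity(emb_a, emb_b)
    topical = 1.0 - hellinger_distance(topics_a, topics_b)
    return alpha * structural + (1.0 - alpha) * topical
```

In such a setup, author pairs would be ranked by `combined_score`, and the ranking would be evaluated with AUC against held-out collaboration links; `alpha` is the kind of hyperparameter the Bayesian optimization step would select.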

Received: 03 June 2020      Published: 24 November 2020
Fund: National Natural Science Foundation of China (71704170); Informatization Project of the Chinese Academy of Sciences (XXH13506-203); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2016159)
