A Prediction Model with Network Representation Learning and Topic Model for Author Collaboration
Zhang Xin1, Wen Yi1,2, Xu Haiyun1,2
1 Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
2 Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
Abstract [Objective] This paper proposes a method for predicting scientific collaboration based on network representation learning and the author-topic model. [Methods] First, we built an embedding vector for each author with a network representation learning method and measured the structural similarity between authors with cosine similarity. Next, we obtained a topic representation for each author with the author-topic model and measured the authors’ topic similarity with the Hellinger distance. Finally, we linearly merged the two similarity measures and used Bayesian optimization to select the hyperparameters. [Results] We evaluated the proposed method on the NIPS datasets. After Bayesian hyperparameter selection, the node2vec+ATM model performed best, with an AUC of 0.9271, which was 0.1856 higher than that of the benchmark model. [Limitations] The model does not include the authors’ institutions or geographic locations. [Conclusions] The proposed model combines structure and content features to improve the prediction results of network representation learning.
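The scoring step described in [Methods] can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the weight `alpha`, and the conversion of the Hellinger distance into a similarity as `1 - distance` are assumptions, since the abstract only states that the two measures are merged linearly. The embeddings and topic distributions are taken as given inputs (for example, from node2vec and an author-topic model).

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared) / math.sqrt(2.0)


def combined_score(emb_u, emb_v, topics_u, topics_v, alpha):
    """Linear merge of structural and topic similarity.

    alpha is a hypothetical mixing weight in [0, 1]; mapping the Hellinger
    distance to a similarity via 1 - distance is an assumption of this sketch.
    """
    structural = cosine_similarity(emb_u, emb_v)
    topical = 1.0 - hellinger_distance(topics_u, topics_v)
    return alpha * structural + (1.0 - alpha) * topical


if __name__ == "__main__":
    # Toy inputs: 3-dimensional embeddings and 2-topic distributions.
    score = combined_score([0.2, 0.7, 0.1], [0.3, 0.6, 0.2],
                           [0.9, 0.1], [0.8, 0.2], alpha=0.5)
    print(score)
```

In a setup like the one the abstract describes, a weight such as `alpha` would be among the hyperparameters chosen by Bayesian optimization, with AUC on held-out author pairs as the objective.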
Received: 03 June 2020
Published: 24 November 2020
Fund: National Natural Science Foundation of China (71704170); Informatization Project of the Chinese Academy of Sciences (XXH13506-203); Youth Innovation Promotion Association of the Chinese Academy of Sciences (2016159)
Corresponding Author: Wen Yi, E-mail: wenyi@clas.ac.cn