Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (10): 28-36    DOI: 10.11925/infotech.2096-3467.2020.0062
Predicting Research Collaboration Based on Translation Model
Chen Wenjie
Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041, China
Abstract  

[Objective] This paper proposes a modified translation model, TransTopic, to predict research collaboration, aiming to promote exchanges among researchers and maximize efficiency. [Methods] TransTopic maps the nodes and edges of a research collaboration network uniformly to low-dimensional vectors. First, an LDA model extracts the topic distribution features of stem cell papers. Then, a deep autoencoder turns these topic features into edge vectors, and node vectors are obtained from them through the translation mechanism. Finally, research collaboration is predicted through semantic calculation between the vectors. [Results] For link prediction, TransTopic's AUC (95.21%) and MeanRank (17.48) are better than those of the existing models, and its topic prediction accuracy reaches 86.52%. [Limitations] The proposed method considers only one-step translation paths and does not fully utilize information such as authors' institutions, research interests, and publication levels. [Conclusions] The proposed translation-based method can effectively predict research collaboration in the field of stem cells.

Key words: Translation Model; Deep Autoencoder; Topic Model; Link Prediction
Received: 19 January 2020      Published: 09 November 2020
Chinese Library Classification (CLC) number: TP391
Corresponding Author: Chen Wenjie     E-mail: chenwj@clas.ac.cn

Cite this article:

Chen Wenjie. Predicting Research Collaboration Based on Translation Model. Data Analysis and Knowledge Discovery, 2020, 4(10): 28-36.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.0062     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I10/28

Figure: Translation Mechanism
Figure: Deep Autoencoder
Figure: Architecture of TransTopic
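The translation mechanism named in the abstract can be illustrated with a minimal sketch. It assumes a TransE-style rule in which a collaboration is plausible when the first author's vector plus the edge vector lands near the second author's vector. The L1 distance, the function names, and the toy two-dimensional vectors are illustrative assumptions, not details taken from the paper.

```python
def translation_distance(head, edge, tail):
    """L1 distance ||head + edge - tail||_1; smaller means a more plausible link."""
    return sum(abs(h + e - t) for h, e, t in zip(head, edge, tail))


def rank_candidates(head, edge, candidates):
    """Return candidate indices ordered from most to least plausible partner."""
    distances = [translation_distance(head, edge, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: distances[i])


# Toy vectors (hypothetical): one author, one topic-derived edge, three candidates.
author = [0.0, 1.0]
topic_edge = [1.0, 0.0]
candidates = [[1.0, 1.0], [3.0, 3.0], [1.0, 0.5]]

ranking = rank_candidates(author, topic_edge, candidates)
print(ranking)  # [0, 2, 1]
```

In the paper's setting the edge vector would come from the deep autoencoder applied to LDA topic features; here it is simply given as input.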
Table: Dataset
Dataset         Net-S      Net-M        Net-L
Nodes           59,586     354,740      632,893
Edges           264,174    1,605,263    3,026,951
Training set    211,339    1,284,210    2,421,561
Test set        52,835     321,053      605,390
Table: AUC
Model         Net-S     Net-M     Net-L
DeepWalk      59.26%    53.15%    52.45%
Node2Vec      72.42%    64.77%    62.95%
LINE          71.59%    65.62%    61.38%
TransE        84.83%    77.56%    69.42%
TransTopic    95.21%    87.13%    80.26%
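For readers unfamiliar with AUC in link prediction, the following sketch uses one common definition: the probability that a held-out true edge scores higher than a non-existent edge, with ties counted as half. The function name and the toy scores are assumptions for illustration; the page does not state how the paper sampled negative edges or computed the metric.

```python
def link_prediction_auc(pos_scores, neg_scores):
    """Fraction of (true edge, non-edge) pairs ranked correctly; ties count 0.5.

    Scores are similarities (higher = more likely). For a distance-based
    model, pass negated distances.
    """
    wins = 0
    ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))


# Toy scores (hypothetical): three held-out true edges, two sampled non-edges.
auc = link_prediction_auc([0.9, 0.8, 0.4], [0.5, 0.4])
print(auc)  # 0.75
```

This exhaustive pairwise comparison is quadratic in the number of scores; on networks of the size listed in the dataset table, the pairs would normally be sampled instead.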
Table: MeanRank
Model         Net-S     Net-M     Net-L
DeepWalk      137.52    190.84    229.26
Node2Vec      83.34     134.48    171.04
LINE          92.81     105.64    159.44
TransE        39.40     60.89     87.32
TransTopic    17.48     26.45     51.27
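MeanRank is usually read as the average position of the true partner when all candidates are sorted by model distance, so lower is better. The sketch below assumes a 1-based rank with ties resolved in favor of the true candidate. The function name and toy distances are hypothetical, and the paper's exact protocol (for example, whether known edges are filtered from the candidate list) is not stated on this page.

```python
def mean_rank(distance_lists, true_indices):
    """Average 1-based rank of the true candidate across test cases.

    distance_lists[k] holds the model distance to every candidate for test
    case k; true_indices[k] is the position of the true partner in that list.
    """
    ranks = []
    for distances, true_idx in zip(distance_lists, true_indices):
        true_distance = distances[true_idx]
        # Rank = 1 + number of candidates strictly closer than the true one.
        ranks.append(1 + sum(1 for d in distances if d < true_distance))
    return sum(ranks) / len(ranks)


# Toy distances (hypothetical): two test edges, three candidates each.
result = mean_rank([[0.2, 0.5, 0.9], [0.7, 0.1, 0.3]], [0, 2])
print(result)  # 1.5
```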
Table: Top 5 Partnerships
Rank    Author 1             Author 2                 Number of shared topics
1       Pearson, Bret J.     Beerman, Isabel          20
2       Yu, Jennifer S.      Rich, Jeremy N.          19
3       Chen, Chang-Zheng    Hassanshahi, Mohammad    19
4       Zhu, Lian            Wan, Wu                  19
5       Miere, Cristian      Wood, Victoria           18
Table: Topic Prediction
d      Net-S     Net-M     Net-L
20     82.17%    80.95%    76.54%
70     86.52%    84.73%    81.65%
120    81.23%    79.24%    73.28%