[Objective] To promote exchange and cooperation among researchers, this paper proposes a method for predicting scientific research cooperation that combines heterogeneous networks with representation learning. [Methods] We construct a heterogeneous research cooperation network from information on scholars, institutions, papers, and journals. Based on the different co-occurrence relationships among scholars in this network, we divide it into three homogeneous co-occurrence networks. We then use Node2Vec and Doc2Vec to learn each scholar's network-structure feature vector and content-attribute feature vector, respectively, and fuse the two. Finally, we predict cooperation by computing the cosine similarity between scholar vectors. [Results] In comparative experiments on artificial intelligence papers from the Web of Science database, the proposed method reached an AUC of 0.9879 and an F1 score of 0.9424, outperforming the baseline methods. [Limitations] The representation of scholars' content features does not take their research topics into account. [Conclusions] The proposed method considers both the structural and the content attributes of scholars and, through the heterogeneous network, integrates information on institutions, papers, and journals, which yields better cooperation prediction.
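The last two steps of the method, fusing the two feature vectors and scoring scholar pairs by cosine similarity, can be illustrated with a minimal sketch. The abstract does not state how the vectors are fused, so the concatenation of L2-normalised vectors used here is an assumption. The toy vectors and the helper names (`fuse`, `rank_candidates`) are hypothetical; in practice the inputs would be the outputs of Node2Vec and Doc2Vec.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(structure_vec: Sequence[float], content_vec: Sequence[float]) -> List[float]:
    """Assumed fusion: concatenate the two L2-normalised vectors.

    The paper's abstract only says the vectors are fused, not how.
    """
    return l2_normalize(structure_vec) + l2_normalize(content_vec)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    target: str, scholars: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Rank all other scholars by similarity to the target, highest first."""
    scores = [
        (name, cosine_similarity(scholars[target], vec))
        for name, vec in scholars.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for Node2Vec (structure)
# and Doc2Vec (content) embeddings of three scholars.
structure = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0]}
content = {"A": [0.2, 0.8], "B": [0.3, 0.7], "C": [0.9, 0.1]}

fused = {name: fuse(structure[name], content[name]) for name in structure}
ranking = rank_candidates("A", fused)
```

Here `ranking` lists scholar B ahead of scholar C as a candidate collaborator for A, because B is close to A in both the structure and the content vectors. Normalising before concatenation keeps either feature type from dominating the similarity purely through its scale.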
Li Hui, Liu Sha, Hu Yaohua, Meng Wei. Predicting Scientific Research Cooperation with Heterogeneous Network and Representation Learning[J]. Data Analysis and Knowledge Discovery, 2023, 7(9): 78-88.