[Objective] This study seeks to establish a fair and objective mechanism for evaluating academic impact, addressing problems of existing approaches such as unwieldy appraisal systems, complicated calculations, and vague conclusions. [Methods] We proposed a method that ranks scholars’ impact based on citation behavior and academic similarity, using the Word2Vec, TF-IDF, and PageRank algorithms. [Results] The proposed method combined the influence of a researcher’s scholarly relationships with that of their academic outputs. It performed well on the validity dimension: the correlations of the PR value with the H-index and with eigenvector centrality were 0.872 and 0.617, respectively. The proposed evaluation index could replace traditional metrics. Compared with the original PageRank algorithm, the average H-index and average citation frequency of scholars within a fixed ranking interval both increased: for the top 100 scholars, the average H-index rose by 1.087 and the average citation frequency rose by 2.080. [Limitations] The proposed algorithm was less efficient than the original PageRank algorithm. [Conclusions] The new algorithm can be used to analyze academic networks with a large number of nodes, and a node’s PR value becomes more accurate as the quality of the network improves. The new ranking algorithm can therefore effectively evaluate the academic impact of many scholars across multiple disciplines, and it performs better than existing methods.
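The abstract does not give the method's formulas, so the following is only a minimal sketch of the general idea of combining citation behavior with academic similarity inside PageRank. It makes several assumptions that do not come from the paper: similarity is computed from TF-IDF vectors alone (Word2Vec is left out to keep the example dependency-free), the edge weight is the hypothetical rule `citation count * (1 + similarity)`, and the scholar profiles and citation counts are toy data.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs.values() for term in set(doc))
    vectors = {}
    for name, doc in docs.items():
        tf = Counter(doc)
        # Smoothed IDF so that no term gets a zero weight.
        vectors[name] = {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                         for t, c in tf.items()}
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0

def weighted_pagerank(nodes, edges, damping=0.85, iterations=100):
    """PageRank where each source splits its rank in proportion to edge weight.

    edges maps (citing, cited) pairs to a non-negative weight.
    """
    out_weight = {n: 0.0 for n in nodes}
    for (src, _), w in edges.items():
        out_weight[src] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing weight is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes)
                    for n in nodes}
        for (src, dst), w in edges.items():
            if w > 0:
                new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank

# Toy data (hypothetical): keyword profiles and citation counts.
profiles = {
    "A": "citation network ranking pagerank".split(),
    "B": "pagerank ranking scholar influence".split(),
    "C": "protein folding simulation".split(),
}
citations = {("A", "B"): 2, ("A", "C"): 2, ("B", "A"): 1, ("C", "A"): 1}

vectors = tfidf_vectors(profiles)
# Assumed weighting rule: citations count more when the two scholars are similar.
edges = {(s, d): c * (1 + cosine(vectors[s], vectors[d]))
         for (s, d), c in citations.items()}
ranks = weighted_pagerank(list(profiles), edges)

for name, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 4))
```

In this toy network, A cites B and C equally often, but B's profile overlaps with A's while C's does not, so B receives a larger share of A's rank than it would under unweighted PageRank.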
Liu Junwan, Yang Bo, Wang Feifei. Ranking Scholarly Impacts Based on Citations and Academic Similarity[J]. Data Analysis and Knowledge Discovery, 2018, 2(4): 59-70.