[Objective] This paper proposes a citation prediction model for scholarly articles, which can help identify potential research hot spots and optimize journal editing. [Methods] First, we used graph convolution to extract literature features covering keywords, authors, institutions, countries, and citations. Then, we used a recurrent neural network with an attention mechanism to model the time-series information of citations and the other features. [Results] We evaluated the proposed model on transportation articles from core journals indexed by the Web of Science. Compared with the benchmark model, the proposed method improved RMSE and MAE by up to 15.23% and 16.91%, respectively. [Limitations] The pre-training stage applies multiple graph convolutions, which is very time consuming. [Conclusions] The proposed model, which fully integrates literature features, can effectively predict the citations of scholarly articles.
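The Results above are reported as improvements in RMSE and MAE over a benchmark. A minimal sketch of how such figures can be computed is given here, assuming the improvement is the relative reduction in error against the benchmark (the abstract does not state the exact formula). The function names and the toy citation counts are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted citation counts."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted citation counts."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_improvement(baseline_error, model_error):
    """Percentage reduction in error relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy values invented for illustration only; not results from the paper.
observed = [10, 4, 0, 25]
baseline = [6, 8, 2, 15]   # hypothetical benchmark predictions
proposed = [8, 6, 1, 20]   # hypothetical predictions of a better model

if __name__ == "__main__":
    for name, metric in (("RMSE", rmse), ("MAE", mae)):
        base = metric(observed, baseline)
        ours = metric(observed, proposed)
        gain = relative_improvement(base, ours)
        print(f"{name}: baseline={base:.3f} proposed={ours:.3f} improvement={gain:.2f}%")
```

RMSE penalizes large individual errors more heavily than MAE, so the two improvements usually differ, as they do in the reported 15.23% and 16.91%.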
张思凡,牛振东,陆浩,朱一凡,王荣荣. 基于图卷积嵌入与特征交叉的文献被引量预测方法:以交通运输领域为例*[J]. 数据分析与知识发现, 2020, 4(9): 56-67.
Zhang Sifan, Niu Zhendong, Lu Hao, Zhu Yifan, Wang Rongrong. Predicting Citations Based on Graph Convolution Embedding and Feature Cross: Case Study of Transportation Research[J]. Data Analysis and Knowledge Discovery, 2020, 4(9): 56-67.