Predicting Social Media Visibility of Scholarly Articles
Li Gang, Guan Weidong, Ma Yaxue, Mao Jin
Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
Abstract [Objective] This study predicts the visibility of research papers on Twitter from their multidimensional features, aiming to identify the important factors affecting social media visibility. [Methods] First, we determined each paper's social media visibility by its total mentions on Twitter, and extracted features from paper contents, authorship and publishing journals. Then, we constructed a binary classification model to predict each paper's Twitter visibility. Finally, we tested the model on papers about diabetes to evaluate the performance of different algorithms and the importance of each feature. [Results] LightGBM performed best, with an accuracy of 0.70. Features from contents, authorship and publishing journals all influenced an article's visibility on social media, and the journal's annual average impact factor was the most important one. [Limitations] We only examined the visibility of diabetes-related papers on Twitter. [Conclusions] Ensemble learning algorithms are an effective method for predicting the social media visibility of scholarly articles, and features of the publishing journals are the key factors.
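The labeling and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the median cut-off, the function names `label_visibility` and `accuracy`, and the sample data are all assumptions made for illustration, since the abstract does not state how the visibility threshold was chosen.

```python
from statistics import median


def label_visibility(tweet_counts, threshold=None):
    """Turn total Twitter mentions into binary visibility labels.

    Assumption: papers above the median mention count are labeled 1
    (high visibility) and the rest 0. The abstract does not give the
    actual cut-off, so the median is only a placeholder.
    """
    if threshold is None:
        threshold = median(tweet_counts)
    return [1 if count > threshold else 0 for count in tweet_counts]


def accuracy(y_true, y_pred):
    """Share of papers whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical mention counts for six papers (not data from the study).
tweet_counts = [0, 2, 15, 1, 40, 3]
labels = label_visibility(tweet_counts)

# Hypothetical classifier output, used only to show the metric.
predictions = [0, 1, 1, 0, 1, 0]
score = accuracy(labels, predictions)
```

In a full pipeline, the labels would be the target of a classifier such as LightGBM trained on the content, authorship and journal features, and `accuracy` would be computed on held-out papers.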
Received: 21 February 2020
Published: 14 September 2020
Corresponding Author:
Ma Yaxue
E-mail: myx_vicky@whu.edu.cn