Predicting Popularity of Emerging Topics with Multivariable LSTM and Bibliometric Indicators
Chen Wen 1,2, Chen Wei 1,2,3
1 Wuhan Documentation and Information Center, Chinese Academy of Sciences, Wuhan 430071, China
2 Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
3 Hubei Key Laboratory of Big Data in Science and Technology, Wuhan 430071, China
[Objective] This paper identifies emerging topics from multi-source data and constructs a multivariable LSTM model with bibliometric indicators to predict their popularity. [Methods] First, we extracted topics from funded projects, papers and patents. Second, we identified the emerging topics based on their novelty, growth and persistence. Finally, we predicted the popularity of these topics with the multivariable LSTM model, using four indicators as inputs: funding amount, number of funded projects, average citations per paper, and number of patent IPC subclasses. [Results] We evaluated the model on research in the solid oxide fuel cell field, where it outperformed the BP, KNN, SVM and univariate LSTM models. Our model had the lowest MAE (16.534) and RMSE (23.494), as well as the highest R² (0.642). [Limitations] We did not include patent citation counts because it was difficult to obtain this data for each time window. [Conclusions] The multivariable LSTM model could effectively predict the popularity of emerging topics.
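The data preparation and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`make_windows`, `mae`, `rmse`, `r2`), the feature ordering, and the toy numbers are all assumptions made for illustration. Each row is assumed to hold one time window of a topic, with popularity first and the four indicators after it. The sketch builds the sliding-window samples a multivariable sequence model would consume, and defines the three reported metrics in their standard form.

```python
import math

def make_windows(rows, n_steps):
    """Build (X, y) samples from per-window feature rows.

    Assumed row layout (hypothetical): [popularity, funding_amount,
    n_funded_projects, avg_citations_per_paper, n_ipc_subclasses].
    X[i] holds n_steps consecutive rows; y[i] is the popularity
    of the window that follows them.
    """
    X, y = [], []
    for i in range(len(rows) - n_steps):
        X.append(rows[i:i + n_steps])
        y.append(rows[i + n_steps][0])
    return X, y

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy rows, invented purely to show the shapes involved.
toy_rows = [
    [10, 1.0, 2, 3.5, 4],
    [12, 1.5, 3, 3.8, 4],
    [15, 2.0, 3, 4.1, 5],
    [19, 2.2, 4, 4.0, 6],
    [24, 3.0, 5, 4.6, 6],
]
X, y = make_windows(toy_rows, n_steps=2)
```

Here `X` would be the input to a sequence model with shape (samples, time steps, features), and the metric functions would compare that model's predictions against `y`.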