Subject Topic Mining and Evolution Analysis with Multi-Source Data
Li Hui, Hu Jixia, Tong Zhiying
School of Economics and Management, Xidian University, Xi’an 710126, China

Abstract [Objective] This paper examines how research topics evolve, helping researchers quickly identify the current state and trends of their fields. [Methods] First, we merged multi-source datasets and divided the domain's research topics by time period. Second, we calculated each topic's importance from its popularity, density, and closeness centrality. Third, we used semantic similarity to identify related topics in adjacent time periods. Finally, we combined the fluctuation in topic importance with topic similarity to determine evolution types and paths. [Results] We tested the model on artificial intelligence papers and analyzed how topics changed over the past 20 years. We identified popular research topics and their evolution paths, which showed clear thematic fusion and splitting across four periods. [Limitations] The topic naming rules could be more effective, and we could not show the whole life cycle of artificial intelligence research, which is still booming. [Conclusions] The proposed model can effectively reveal the evolution of research topics.

Received: 13 November 2021
Published: 24 August 2022

Fund: National Natural Science Foundation of China (71203173)
Corresponding Author:
Li Hui, ORCID: 0000-0002-3468-5170
E-mail: lihui@xidian.edu.cn
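
The last three steps summarized in the abstract (scoring topic importance, linking similar topics in adjacent periods, and labeling evolution types) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the equal-weight `importance` function, the cosine similarity over topic-word weights, the 0.5 threshold, the evolution labels other than fusion and split, and all function names and toy data below are assumptions made only for illustration.

```python
import math


def cosine(p, q):
    """Cosine similarity between two sparse word-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def importance(popularity, density, closeness):
    """Assumed equal-weight mean of the three indicators (hypothetical)."""
    return (popularity + density + closeness) / 3.0


def link_topics(prev, curr, threshold=0.5):
    """Pairs (prev_id, curr_id) whose similarity reaches the threshold."""
    return [(a, b)
            for a, ta in prev.items()
            for b, tb in curr.items()
            if cosine(ta["words"], tb["words"]) >= threshold]


def classify(prev, curr, threshold=0.5):
    """Label each link between two adjacent periods (assumed rules)."""
    links = link_topics(prev, curr, threshold)
    out_deg = {a: sum(1 for x, _ in links if x == a) for a in prev}
    in_deg = {b: sum(1 for _, y in links if y == b) for b in curr}
    labels = {}
    for a, b in links:
        if out_deg[a] > 1:        # one topic feeds several successors
            labels[(a, b)] = "split"
        elif in_deg[b] > 1:       # several topics feed one successor
            labels[(a, b)] = "fusion"
        else:                     # one-to-one: use importance change
            delta = curr[b]["importance"] - prev[a]["importance"]
            labels[(a, b)] = ("growth" if delta > 0
                              else "decline" if delta < 0 else "stable")
    emerged = [b for b in curr if in_deg[b] == 0]
    vanished = [a for a in prev if out_deg[a] == 0]
    return labels, emerged, vanished


# Toy topics for two adjacent periods (made-up words and scores).
prev = {
    "A": {"words": {"neural": 0.6, "network": 0.4},
          "importance": importance(0.5, 0.4, 0.3)},
    "B": {"words": {"expert": 0.7, "system": 0.3},
          "importance": importance(0.3, 0.2, 0.1)},
}
curr = {
    "C": {"words": {"neural": 0.5, "network": 0.3, "deep": 0.2},
          "importance": importance(0.8, 0.6, 0.7)},
    "D": {"words": {"robot": 0.9, "control": 0.1},
          "importance": importance(0.2, 0.2, 0.2)},
}

labels, emerged, vanished = classify(prev, curr)
print(labels, emerged, vanished)
```

In this toy example, topic A continues into topic C with higher importance, topic B has no successor, and topic D has no predecessor. Chaining `classify` over each pair of adjacent periods would give one possible way to trace evolution paths.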