Please wait a minute...
New Technology of Library and Information Service  2015, Vol. 31 Issue (3): 18-25    DOI: 10.11925/infotech.1003-3513.2015.03.03
Current Issue | Archive | Adv Search |
Topic Evolution Research on a Certain Field Based on LDA Topic Association Filter
Qin Xiaohui1,2, Le Xiaoqiu1
1 National Science Library, Chinese Academy of Sciences, Beijing 100190, China;
2 University of Chinese Academy of Sciences, Beijing 100049, China
Download: PDF(558 KB)   HTML  
Export: BibTeX | EndNote (RIS)      
Abstract  

[Objective] To detect the birth, extinction, development, merge and split of topic evolution of the literatures in a certain field. [Methods] This paper divides time windows according to the publication data of the literatures, and LDA model is applied to extract topics from each time window automatically. The topic association filter rules are used to determine evolution relationships between topics in adjacent time windows. Form a topic evolution path in a continuous time period. [Results] Considering the continuity of the topics, different types of topic evolution could be detected with high accuracy. [Limitations] This method fixes the size of time windows without considering the diversity of topic evolution cycles. [Conclusions] This method can effectively reduce the interference of topics with smaller similarity in LDA, and enhance accuracy of evolution relation recognition.

Key wordsTopic association      Topic evolution      Topic model      LDA     
Received: 08 October 2014      Published: 16 April 2015
:  TP393  

Cite this article:

Qin Xiaohui, Le Xiaoqiu. Topic Evolution Research on a Certain Field Based on LDA Topic Association Filter. New Technology of Library and Information Service, 2015, 31(3): 18-25.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.03.03     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I3/18

[1] 李勇, 安新颖. 基于LDA的主题演化研究[J]. 医学信息学杂志, 2013, 34(2): 57-61. (Li Yong, An Xinying. Research on Topic Evolution Based on LDA [J]. Journal of Medical Informatics, 2013, 34(2): 57-61.)
[2] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet Allocation [J]. The Journal of Machine Learning Research, 2003, 3: 993-1022.
[3] 楚克明, 李芳. 基于LDA模型的新闻话题的演化[J]. 计算机应用与软件, 2011, 28(4): 4-7, 26.( Chu Keming, Li Fang. LDA Model-based News Topic Evolution [J]. Computer Applications and Software, 2011, 28(4): 4-7, 26.)
[4] 楚克明. 基于LDA的新闻话题演化研究[D]. 上海: 上海交通大学, 2010.(Chu Keming. The Reaearch on Topic Evolution for News Based on LDA Model [D]. Shanghai: Shanghai Jiaotong University, 2010.)
[5] 李保利, 杨星. 基于LDA模型和话题过滤的研究主题演化分析[J]. 小型微型计算机系统, 2012, 33(12): 2738-2743. (Li Baoli, Yang Xing. Analyzing Research Topic Evolution with LDA and Topic Filtering [J]. Journal of Chinese Computer Systems, 2012, 33(12): 2738-2743.)
[6] 崔凯, 周斌, 贾焰, 等.一种基于LDA的在线主题演化挖掘模型[J]. 计算机科学, 2010, 37(11): 156-159, 193. (Cui Kai, Zhou Bin, Jia Yan, et al. LDA-based Model for Online Topic Evolution Mining [J]. Computer Science, 2010, 37(11): 156-159, 193.)
[7] 胡吉明, 陈果. 基于动态LDA主题模型的内容主题挖掘与演化[J]. 图书情报工作, 2014, 58(2): 138-142. (Hu Jiming, Chen Guo. Mining and Eolution of Content Topics Based on Dynamic LDA [J]. Library and Information Service, 2014, 58(2): 138-142.)
[8] Lv N, Luo J, Liu Y, et al. Analysis of Topic Evolution Based on Subtopic Similarity [C]. In: Proceedings of the 2009 International Conference on Computational Intelligence and Natural Computing, 2009, 2: 506-509.
[9] 胡艳丽, 白亮, 张维明. 一种话题演化建模与分析方法[J]. 自动化学报, 2012, 38(10): 1690-1697. (Hu Yanli, Bai Liang, Zhang Weiming. Modeling and Analyzing Topic Evolution [J]. Acta Automatic Sinica, 2012, 38(10): 1690-1697.)
[10] Blei D M, Lafferty J D. Dynamic Topic Models [C]. In: Proceedings of the 23rd International Conference on Machine Learning. 2006: 113-120.
[11] Alsumait L, Barbara D, Domeniconi C. On-line LDA: Adaptive Topic Models for Mining Text Streams with Applications to Topic Detection and Tracking [C]. In: Proceeding of the 8th IEEE International Conference on Data Mining. IEEE, 2008: 3-12.
[12] Wang X, McCallum A. Topics over Time: A Non-Markov Continuous-Time Model of Topical Trends [C]. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2006: 424-433.
[13] 贺亮, 李芳.科技文献话题演化研究[J]. 现代图书情报技术, 2012(4): 61-67. (He Liang, Li Fang. Topic Evolution in Scientific Literature [J]. New Technology of Library and Information Service, 2012(4): 61-67.)
[14] 范云满, 马建霞. 利用LDA的领域新兴主题探测技术综述[J]. 现代图书情报技术, 2012(12): 58-65. (Fan Yunman, Ma Jianxia. Review on the LDA-based Techniques Detection for the Field Emerging Topic [J]. New Technology of Library and Information Service, 2012(12): 58-65.)
[15] 唐晓波, 王洪艳. 基于潜在狄利克雷分配模型的微博主题演化分析[J]. 情报学报, 2013, 32(3): 281-287. (Tang Xiaobo, Wang Hongyan. Analysis of Microblog Topic Evolution Based on Latent Dirichlet Allocation Model [J]. Journal of the China Society for Scientific and Technical Information, 2013, 32(3): 281-287.)
[16] 史庆伟, 乔晓东, 徐硕, 等.作者主题演化模型及其在研究兴趣演化分析中的应用[J]. 情报学报, 2013, 32(9): 912-919. (Shi Qingwei, Qiao Xiaodong, Xu Shuo, et al. Author-topic Evolution Model and Its Application in Analysis of Research Interests Evolution [J]. Journal of the China Society for Scientific and Technical Information, 2013, 32(9): 912-919.)
[17] Xu S, Shi Q, Qiao X, et al. Author-topic over Time (AToT): A Dynamic Users' Interest Model [A].// Mobile, Ubiquitous, and Intelligent Computing [M]. Springer Berlin Heidelberg, 2014: 239-245.
[18] 单斌, 李芳. 基于LDA话题演化研究方法综述[J]. 中文信息学报, 2010, 24(6): 43-49, 68. (Shan Bin, Li Fang. A Survey of Topic Evolution Based on LDA [J]. Journal of Chinese Information Processing, 2010, 24(6): 43-49, 68.)
[19] Wei X, Sun J, Wang X. Dynamic Mixture Models for Multiple Timeseries [C]. In: Proceedings of the 20th International Joint Conference on Artificial Intelligent, Hyderabad, India. 2007: 2909-2914.
[20] Griffiths T L, Steyvers M. Finding Scientific Topics [C]. In: Proceedings of the National Academy of Sciences of the United States of America. 2004: 5228-5235.
[21] Manning C D, Schütze H, Raghavan P. 信息检索导论[M]. 王斌译. 北京: 人民邮电出版社, 2011. (Manning C D, Schütze H, Raghavan P. Introduction to Information Retrieval [M]. Translated by Wang Bin. Beijing: Post & Telecom Press, 2011.)
[22] National Cancer Institute. NCI Thesaurus Hierarchy [EB/OL]. [2014-02-14]. http://ncim.nci.nih.gov/ncimbrowser/pages/source_ hierarchy.jsf?&sab=NCI.

[1] Lixin Xia,Jieyan Zeng,Chongwu Bi,Guanghui Ye. Identifying Hierarchy Evolution of User Interests with LDA Topic Model[J]. 数据分析与知识发现, 2019, 3(7): 1-13.
[2] Qingtian Zeng,Xiaohui Hu,Chao Li. Extracting Keywords with Topic Embedding and Network Structure Analysis[J]. 数据分析与知识发现, 2019, 3(7): 52-60.
[3] Peng Guan,Yuefen Wang,Zhu Fu. Analyzing Topic Semantic Evolution with LDA: Case Study of Lithium Ion Batteries[J]. 数据分析与知识发现, 2019, 3(7): 61-72.
[4] Bengong Yu,Yangnan Chen,Ying Yang. Classifying Short Text Complaints with nBD-SVM Model[J]. 数据分析与知识发现, 2019, 3(5): 77-85.
[5] Peiyao Zhang,Dongsu Liu. Topic Evolutionary Analysis of Short Text Based on Word Vector and BTM[J]. 数据分析与知识发现, 2019, 3(3): 95-101.
[6] Linna Xi,Yongxiang Dou. Examining Reposts of Micro-bloggers with Planned Behavior Theory[J]. 数据分析与知识发现, 2019, 3(2): 13-20.
[7] Hongqinling Wang,Zhichao Ba,Gang Li. Conversational Topic Intensity Calculation and Evolution Analysis of WeChat Group[J]. 数据分析与知识发现, 2019, 3(2): 33-42.
[8] Jie Zhang,Junbo Zhao,Dongsheng Zhai,Ningning Sun. Patent Technology Analysis of Microalgae Biofuel Industrial Chain Based on Topic Model[J]. 数据分析与知识发现, 2019, 3(2): 52-64.
[9] Junwan Liu,Zhixin Long,Feifei Wang. Finding Collaboration Opportunities from Emerging Issues with LDA Topic Model and Link Prediction[J]. 数据分析与知识发现, 2019, 3(1): 104-117.
[10] Guijun Yang,Xue Xu,Fuqiang Zhao. Predicting User Ratings with XGBoost Algorithm[J]. 数据分析与知识发现, 2019, 3(1): 118-126.
[11] Yuemei Xu,Sining Lv,Lianqiao Cai,Xiaoya Zhang. Analyzing News Topic Evolution with Convolutional Neural Networks and Topic2Vec[J]. 数据分析与知识发现, 2018, 2(9): 31-41.
[12] Yue He,Yue Feng,Shupeng Zhao,Yufeng Ma. Recommending Contents Based on Zhihu Q&A Community: Case Study of Logistics Topics[J]. 数据分析与知识发现, 2018, 2(9): 42-49.
[13] Tao Zhang,Haiqun Ma. Clustering Policy Texts Based on LDA Topic Model[J]. 数据分析与知识发现, 2018, 2(9): 59-65.
[14] Yanhua Xu,Yujie Miao,Lin Miao,Xueqiang Lv. Generating HSK Writing Essays with LDA Model[J]. 数据分析与知识发现, 2018, 2(9): 80-87.
[15] Ziming Zeng,Qianwen Yang. Sentiment Analysis for Micro-blogs with LDA and AdaBoost[J]. 数据分析与知识发现, 2018, 2(8): 51-59.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn