Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (10): 54-64    DOI: 10.11925/infotech.2096-3467.2018.0196
Dividing Time Windows of Dynamic Topic Model
Tingting Wang, Yu Wang, Linjie Qin
Institute of Statistics, Huaqiao University, Xiamen 361021, China

[Objective] This paper proposes a Document Influence Model (DIM) with dynamic, automatically divided time windows, aiming to solve the time window division problem of the dynamic topic model. [Methods] First, we processed the text corpus with the traditional LDA model and a word vector model. Second, we constructed a composite index that reflects both the differences between time windows and the similarity within each window. Finally, we built a new model based on this index and conducted an empirical study on a news corpus covering the “Belt and Road” International Cooperation Summit Forum. [Results] The proposed model divided the time windows quickly and effectively, which kept the topics comparable across windows and made it possible to evaluate the influence factor of each document. [Limitations] The similarity index of the time windows is built on the traditional LDA model and could be improved with more recent LDA variants. [Conclusions] The new model divides time-series text effectively and improves the performance of the traditional dynamic topic model.
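The abstract does not give the formula for the index, so the following is only a minimal sketch of the general idea, under stated assumptions: each document is represented by a topic distribution (as an LDA model would produce), the difference between two windows is taken to be the Jensen-Shannon divergence of their mean topic distributions, the similarity within a window is the mean pairwise cosine similarity, and the two are combined by a simple product. The function names and the combination rule are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits, bounded in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def within_similarity(window):
    """Mean pairwise cosine similarity; 1.0 for a single-document window."""
    pairs = list(combinations(window, 2))
    if not pairs:
        return 1.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def split_score(docs, k):
    """Hypothetical composite index for cutting time-ordered docs at position k."""
    left, right = docs[:k], docs[k:]
    between = js_divergence(mean_vector(left), mean_vector(right))
    within = (within_similarity(left) + within_similarity(right)) / 2
    return between * within  # assumed combination rule, not the paper's


def best_split(docs):
    """Cut position that maximises the composite index."""
    return max(range(1, len(docs)), key=lambda k: split_score(docs, k))


# Toy topic distributions in time order: the dominant topic changes
# after the third document.
docs = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8]]
print(best_split(docs))
```

Applying the same search recursively to each resulting window would yield several windows instead of two; a stopping rule would then be needed, which this sketch does not attempt to supply.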

Key words: Dynamic Topic Model; Adaptive Time Window; DIM; Influence Factor; Text Expansion
Received: 26 February 2018      Published: 12 November 2018

Cite this article:

Tingting Wang, Yu Wang, Linjie Qin. Dividing Time Windows of Dynamic Topic Model. Data Analysis and Knowledge Discovery, 2018, 2(10): 54-64.

