New Technology of Library and Information Service  2016, Vol. 32 Issue (10): 59-69    DOI: 10.11925/infotech.1003-3513.2016.10.07
Original Article
Analyzing Evolution of News Topics with Manifold Learning
Xu Yuemei (1), Li Yang (2, 3), Liang Ye (1), Cai Lianqiao (1)
(1) Department of Computer Science, Beijing Foreign Studies University, Beijing 100089, China
(2) Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
(3) University of Chinese Academy of Sciences, Beijing 100049, China

[Objective] This study examines how online news topics emerge and develop, as a basis for gauging public opinion. [Methods] First, we applied manifold learning to the analysis of news topics. Second, we used the LDA model to identify the topics in each time window and explored the relations among these high-dimensional topics. Third, we clustered the topics and visualized the relations among them in a low-dimensional space. Finally, we analyzed topic evolution with the help of social network theory. [Results] The proposed method effectively identified the topic evolution trends in CNN news reports on China from 2015. [Limitations] We did not fully explore the impact of the time windows. [Conclusions] This study provides a new method for visualizing the evolution of news topics over a period of time, which avoids the inaccurate descriptions caused by changes between adjacent time windows.
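
The step of relating topics across time windows can be illustrated with a small sketch. The abstract does not say which manifold learning algorithm or distance measure was used, so everything below is an assumption: the sketch uses an Isomap-style geodesic distance (shortest paths over a k-nearest-neighbour graph) built on Jensen-Shannon distances, and the topic-word distributions are made-up stand-ins for LDA output. It covers only the distance step, not the low-dimensional embedding, the clustering or the network analysis.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    mid = [(x + y) / 2 for x, y in zip(p, q)]
    return math.sqrt(max(0.0, (kl(p, mid) + kl(q, mid)) / 2))

def geodesic_distances(points, k):
    """Shortest-path distances over a symmetric k-nearest-neighbour graph."""
    n = len(points)
    direct = [[js_distance(a, b) for b in points] for a in points]
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i),
                         key=lambda j: direct[i][j])[:k]
        for j in nearest:
            graph[i][j] = graph[j][i] = direct[i][j]
    # Floyd-Warshall: relax every pair through every intermediate node.
    for via in range(n):
        for i in range(n):
            for j in range(n):
                graph[i][j] = min(graph[i][j], graph[i][via] + graph[via][j])
    return graph

def closest_in_other_windows(label, labels, dist):
    """Label of the nearest topic that belongs to a different time window."""
    i = labels.index(label)
    others = [j for j, other in enumerate(labels) if other[0] != label[0]]
    return labels[min(others, key=lambda j: dist[i][j])]

# Hypothetical topic-word distributions over a 4-word vocabulary,
# keyed by (time window, topic index). Not data from the paper.
topics = {
    ("w1", 0): [0.70, 0.20, 0.05, 0.05],
    ("w1", 1): [0.05, 0.05, 0.20, 0.70],
    ("w2", 0): [0.60, 0.30, 0.05, 0.05],
    ("w2", 1): [0.05, 0.10, 0.25, 0.60],
    ("w3", 0): [0.50, 0.35, 0.10, 0.05],
}
labels = list(topics)
dist = geodesic_distances([topics[label] for label in labels], k=2)

for label in labels:
    print(label, "->", closest_in_other_windows(label, labels, dist))
```

Because all topics are pooled before distances are computed, a topic in one window can be linked to a topic in any other window, not only in the adjacent one.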

Keywords: Latent Dirichlet Allocation; Manifold learning; Topic relevance; Topic evolution
Received: 13 May 2016      Published: 23 November 2016

Cite this article:

Xu Yuemei, Li Yang, Liang Ye, Cai Lianqiao. Analyzing Evolution of News Topics with Manifold Learning. New Technology of Library and Information Service, 2016, 32(10): 59-69.

