Analyzing Evolution of News Topics with Manifold Learning
Xu Yuemei1, Li Yang2,3, Liang Ye1, Cai Lianqiao1
1 Department of Computer Science, Beijing Foreign Studies University, Beijing 100089, China
2 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
[Objective] This study examines how online news topics emerge and develop, as a basis for gauging public opinion. [Methods] First, we introduce manifold learning into the analysis of news topics. Second, we use the LDA model to identify the topics in each time window and explore the relations among these high-dimensional topics. Third, we cluster the topics and visualize the relations among them in a low-dimensional space. Finally, we analyze topic evolution using social network theory. [Results] The proposed method effectively identified the topic evolution trends in CNN news reports on China from 2015. [Limitations] We did not fully explore the impact of the time windows. [Conclusions] This study provides a new method for visualizing the evolution of news topics over a period of time, and it avoids the inaccurate descriptions caused by changes between adjacent time windows.
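The pipeline outlined in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes scikit-learn, uses Isomap as one possible manifold learning algorithm and K-means as one possible clustering step, and runs on a few invented toy headlines rather than the CNN corpus. The function name `embed_and_cluster_topics` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.manifold import Isomap


def embed_and_cluster_topics(windows, n_topics=3, n_clusters=3, n_neighbors=5):
    """Fit LDA per time window, embed all topics in 2-D, then cluster them."""
    # A shared vocabulary keeps topic vectors from different windows comparable.
    vectorizer = CountVectorizer()
    vectorizer.fit([doc for docs in windows for doc in docs])

    topic_vectors, window_ids = [], []
    for w, docs in enumerate(windows):
        counts = vectorizer.transform(docs)
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        lda.fit(counts)
        # Normalize rows so each topic is a probability distribution over words.
        topics = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
        topic_vectors.append(topics)
        window_ids.extend([w] * n_topics)

    # Stack the high-dimensional topics of every window into one matrix.
    high_dim = np.vstack(topic_vectors)
    # Isomap is an assumed choice of manifold learner for this sketch.
    embedding = Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(high_dim)
    # Cluster in the low-dimensional space; topics from different windows that
    # share a cluster are treated as stages of the same evolving theme.
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embedding)
    return embedding, labels, np.array(window_ids)


# Toy data (invented for illustration): three time windows of short headlines.
windows = [
    ["china economy growth slows", "stock market falls in shanghai",
     "beijing smog pollution alert", "economy stimulus plan announced"],
    ["yuan devaluation shakes market", "pollution limits for factories",
     "south china sea island dispute", "market recovery after devaluation"],
    ["south china sea patrol tension", "climate talks pollution pledge",
     "economy growth target lowered", "sea dispute talks continue"],
]

embedding, labels, window_ids = embed_and_cluster_topics(windows)
for w, label, point in zip(window_ids, labels, embedding):
    print(f"window {w}: cluster {label}, position {np.round(point, 3)}")
```

Because all topics are embedded together rather than matched window by window, a cluster may contain topics from non-adjacent windows. With such a small toy corpus the resulting clusters are not meaningful; the sketch only shows how the steps connect.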
Xu Yuemei, Li Yang, Liang Ye, Cai Lianqiao. Analyzing Evolution of News Topics with Manifold Learning[J]. New Technology of Library and Information Service, 2016, 32(10): 59-69.