Analyzing Evolution of News Topics with Manifold Learning
Xu Yuemei1, Li Yang2,3, Liang Ye1, Cai Lianqiao1
1 Department of Computer Science, Beijing Foreign Studies University, Beijing 100089, China
2 Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
3 University of Chinese Academy of Sciences, Beijing 100049, China
Abstract [Objective] This study examines how online news topics emerge and develop, as a basis for gauging public opinion. [Methods] First, we applied manifold learning to the analysis of news topics. Second, we used the LDA model to identify the topics in each time window and explored the relations among these high-dimensional topics. Third, we clustered the topics and visualized their relations in a low-dimensional space. Finally, we analyzed topic evolution using social network theory. [Results] The proposed method effectively identified the topic evolution trends in CNN's 2015 news reports on China. [Limitations] We did not fully explore the impact of time windows. [Conclusions] This study provides a new method for visualizing the evolution of news topics over a period of time, which avoids the inaccurate descriptions caused by changes between adjacent time windows.
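The abstract does not say which manifold learning algorithm is used, so the following is only an illustrative sketch of one step that Isomap-style methods rely on: approximating geodesic distances between topics by shortest paths over a k-nearest-neighbor graph. The four topic vectors, the vocabulary size, the value of k, and the function name `geodesic_distances` are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical topic-word distributions (each row sums to 1).
# In practice these would come from an LDA model fitted per time window.
topics = [
    [0.8, 0.1, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.4, 0.4],
    [0.1, 0.1, 0.8],
]


def geodesic_distances(points, k=2):
    """Approximate geodesic distances via shortest paths on a k-NN graph."""
    n = len(points)
    # Direct Euclidean distance between every pair of topics.
    direct = [[math.dist(points[i], points[j]) for j in range(n)]
              for i in range(n)]

    # Start with no edges: zero on the diagonal, infinity elsewhere.
    graph = [[0.0 if i == j else math.inf for j in range(n)]
             for i in range(n)]

    # Connect each topic to its k nearest neighbors (undirected edges).
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: direct[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = direct[i][j]

    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


geo = geodesic_distances(topics, k=2)
```

In a full pipeline, a distance matrix like `geo` would typically be passed to an embedding step (for example classical multidimensional scaling) before clustering and visualization; that part is omitted here. Pairs that remain unconnected in the neighbor graph keep an infinite distance, so k has to be large enough for the graph to be connected.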
Received: 13 May 2016
Published: 23 November 2016