Analyzing News Topic Evolution with Convolutional Neural Networks and Topic2Vec
Xu Yuemei1, Lv Sining1, Cai Lianqiao1, Zhang Xiaoya2
1Department of Computer Science, Beijing Foreign Studies University, Beijing 100089, China
2School of International Journalism and Communication, Beijing Foreign Studies University, Beijing 100089, China
[Objective] This study analyzes how news topics evolve over time in order to identify trends in public opinion and media coverage of specific events. [Methods] We propose a distributed word representation method based on Topic2Vec to better measure the semantic distance between topics. We then use a convolutional neural network model to learn the topic vectors and cluster similar topics. Finally, we derive the topics’ evolution trends, focus events, and related key sub-topics. [Results] We evaluated the method on news reports about China published on the CNN website between 2015 and 2017; it effectively revealed the evolution of topics and sentiments. [Limitations] We did not explore the impact of the time window length. [Conclusions] Compared with previous models, the proposed method improves the accuracy of topic clustering by 10% and supports the analysis of news topic evolution.
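The idea of following topics across time windows can be illustrated with a minimal sketch. It is not the method proposed here: the paper learns and clusters topic vectors with a convolutional neural network, whereas this sketch substitutes a plain cosine-similarity threshold as a simplified stand-in. The function names, the threshold of 0.8, the window labels, and the toy three-dimensional topic vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_topics(prev_topics, next_topics, threshold=0.8):
    """Link each topic in the later window to its most similar
    topic in the earlier window, or to None if no earlier topic
    reaches the (assumed) similarity threshold."""
    links = {}
    for name, vec in next_topics.items():
        best_name, best_sim = None, threshold
        for prev_name, prev_vec in prev_topics.items():
            sim = cosine(vec, prev_vec)
            if sim >= best_sim:
                best_name, best_sim = prev_name, sim
        links[name] = best_name
    return links


# Toy topic vectors for two consecutive time windows (hypothetical data).
window_a = {"trade": [0.9, 0.1, 0.0], "environment": [0.0, 0.2, 0.9]}
window_b = {"tariffs": [0.8, 0.3, 0.1], "sports": [0.1, 0.9, 0.1]}

links = link_topics(window_a, window_b)
print(links)
```

In this toy setting, a topic linked to an earlier one can be read as a continuation of that topic, while a topic linked to `None` can be read as newly emerging in the later window.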