Data Analysis and Knowledge Discovery  2018, Vol. 2 Issue (9): 31-41    DOI: 10.11925/infotech.2096-3467.2018.0068
Analyzing News Topic Evolution with Convolutional Neural Networks and Topic2Vec
Yuemei Xu1, Sining Lv1, Lianqiao Cai1, Xiaoya Zhang2
1Department of Computer Science, Beijing Foreign Studies University, Beijing 100089, China
2School of International Journalism and Communication, Beijing Foreign Studies University, Beijing 100089, China
Abstract  

[Objective] This study analyzes the evolution of news topics, aiming to identify public opinion on, and media coverage of, specific events. [Methods] We propose a distributed word representation method based on Topic2Vec to better capture the semantic distance between topics. We then use a convolutional neural network model to learn the topic vectors and cluster similar topics. Finally, we obtain the topics’ evolution trends, focus events, and related key sub-topics. [Results] We evaluated the proposed method on news reports about China collected from the CNN website between 2015 and 2017; the method effectively revealed the evolution of topics and sentiments. [Limitations] We did not explore the impact of time window length. [Conclusions] Compared with previous models, the proposed method improves the accuracy of topic clustering by 10% and helps explore the topic evolution of news.

Key words: News Topic; Convolutional Neural Networks; Topic Evolution; Topic2Vec
Received: 18 January 2018      Published: 25 October 2018

Cite this article:

Yuemei Xu, Sining Lv, Lianqiao Cai, Xiaoya Zhang. Analyzing News Topic Evolution with Convolutional Neural Networks and Topic2Vec. Data Analysis and Knowledge Discovery, 2018, 2(9): 31-41.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0068     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2018/V2/I9/31
