Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (2/3): 200-206    DOI: 10.11925/infotech.2096-3467.2019.0634
Tracking Static Topics with Bayesian Network
Xu Jianmin, Zhang Liqing, Wang Miao
School of Cyber Security and Computer, Hebei University, Baoding 071002, China
Abstract  

[Objective] This paper analyzes the feasibility of using Bayesian networks for topic tracking and proposes a new method to improve tracking performance.
[Methods] We constructed two topic tracking models: one based on a Bayesian network (BNTT) and the other on an extended Bayesian network (E_BNTT). The nodes in the models represent terms, events and topics, while the arcs represent the relationships among nodes. We then calculated the similarity among topics, events and stories with the Propagation and Evaluation method.
[Results] We evaluated the models on the TDT4 data set. The DET curve of the Bayesian network model lay below that of the vector space model (VSM), indicating better tracking performance. The extended Bayesian network model performed about 1.7% better than the basic Bayesian network model (optimal normalized detection cost of 0.13922 versus 0.15621).
[Limitations] The extended Bayesian network topic tracking model is a static topic model, whereas events are generated as topics evolve, so its performance improvement is limited.
[Conclusions] The new models describe the structural relationships among topics, events and stories and support probabilistic inference, which effectively improves topic tracking performance.

Key words: Bayesian Network; Topic Tracking; Event; Static Topic Model
Received: 10 June 2019      Published: 26 April 2020
ZTFLH:  TP391.1  
Corresponding Authors: Jianmin Xu     E-mail: hbuxjm@hbu.edu.cn

Cite this article:

Xu Jianmin, Zhang Liqing, Wang Miao. Tracking Static Topics with Bayesian Network. Data Analysis and Knowledge Discovery, 2020, 4(2/3): 200-206.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0634     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I2/3/200

[Figure: Bayesian Network]
[Figure: BNTT Model]
[Figure: E_BNTT Model]
                      Actual "yes"   Actual "no"
Model judges "yes"    a              b
Model judges "no"     c              d
Parameters Description
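
The counts a, b, c and d in the table above are the quantities from which TDT-style evaluations usually derive the miss and false-alarm probabilities. The sketch below assumes the conventional definitions Pmiss = c / (a + c) and Pfa = b / (b + d), which this page does not spell out. The function name and the example counts are hypothetical and serve only as an illustration.

```python
def miss_and_false_alarm(a, b, c, d):
    """Return (p_miss, p_fa) from the four contingency counts.

    a: model says "yes", truth is "yes"  (correct detection)
    b: model says "yes", truth is "no"   (false alarm)
    c: model says "no",  truth is "yes"  (miss)
    d: model says "no",  truth is "no"   (correct rejection)

    Assumed definitions (conventional in TDT, not quoted from the paper):
        p_miss = c / (a + c)
        p_fa   = b / (b + d)
    """
    on_topic = a + c    # stories that truly belong to the topic
    off_topic = b + d   # stories that truly do not belong to the topic
    p_miss = c / on_topic if on_topic else 0.0
    p_fa = b / off_topic if off_topic else 0.0
    return p_miss, p_fa


# Hypothetical counts, for illustration only (not from the paper).
p_miss, p_fa = miss_and_false_alarm(a=90, b=5, c=10, d=395)
print(p_miss, p_fa)
```

A lower value is better for both probabilities. They trade off against each other as the decision threshold moves, and the DET curve plots that trade-off.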
δ      Pmiss     Pfa       optimal((Cdet)norm)
0.05   0.09346   0.01281   0.15621
0.10   0.07477   0.01315   0.13922
0.15   0.06542   0.01558   0.14174
0.20   0.06231   0.01800   0.15050
0.25   0.09657   0.01558   0.17290
0.30   0.09346   0.01661   0.17487
0.35   0.11526   0.02008   0.21364
Performance of E_BNTT Model with Different Values of Parameter δ
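
The cost column above can be approximately recomputed from Pmiss and Pfa. The sketch below assumes the standard TDT normalized detection cost with the customary parameters C_miss = 1, C_fa = 0.1 and P_target = 0.02. These parameters are an assumption, since this page does not state them. Under that assumption the recomputed costs agree with the tabulated ones to roughly four decimal places, and the smallest cost falls at δ = 0.10.

```python
# Assumed TDT cost parameters (customary values; not stated on this page).
C_MISS = 1.0
C_FA = 0.1
P_TARGET = 0.02


def normalized_detection_cost(p_miss, p_fa,
                              c_miss=C_MISS, c_fa=C_FA, p_target=P_TARGET):
    """Normalized detection cost (Cdet)norm.

    Cdet = c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target),
    divided by the cost of the better trivial system (always "yes" or
    always "no"): min(c_miss * p_target, c_fa * (1 - p_target)).
    """
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


# (delta, Pmiss, Pfa) rows copied from the table above.
rows = [
    (0.05, 0.09346, 0.01281),
    (0.10, 0.07477, 0.01315),
    (0.15, 0.06542, 0.01558),
    (0.20, 0.06231, 0.01800),
    (0.25, 0.09657, 0.01558),
    (0.30, 0.09346, 0.01661),
    (0.35, 0.11526, 0.02008),
]

costs = {delta: normalized_detection_cost(pm, pf) for delta, pm, pf in rows}
best_delta = min(costs, key=costs.get)  # delta with the lowest cost

for delta, cost in costs.items():
    print(f"delta={delta:.2f}  (Cdet)norm={cost:.5f}")
print("best delta:", best_delta)
```

With these parameters the normalized cost reduces to Pmiss + 4.9 × Pfa, so a false alarm weighs about five times as much as a miss.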
[Figure: Performance of BNTT and VSM]
Performance / Model   BNTT      E_BNTT
Pmiss                 0.09346   0.06542
Pfa                   0.01281   0.01558
optimal((Cdet)norm)   0.15621   0.13922
Performance of BNTT and E_BNTT