New Technology of Library and Information Service  2010, Vol. 26 Issue (11): 45-52    DOI: 10.11925/infotech.1003-3513.2010.11.07
Detection Method of Latent Burst Word Based on the Clue of Energy Evolution
Hong Na¹, Zhang Zhixiong², Le Xiaoqiu²
1. Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China;
2. National Science Library, Chinese Academy of Sciences, Beijing 100190, China
Abstract  

This article analyzes the feasibility of detecting latent burst words by tracking clues in their energy evolution, and proposes a method based on word energy and energy evolution trends. First, it describes the life cycle and evolution process of words. Then, based on an analysis of energy accumulation and decay and of energy change trends, it presents the evidence supporting the model and establishes the EneTr model to detect latent burst words. In addition, it proposes corresponding solutions to the key problems of EneTr and implements the algorithm. Finally, the model is validated by separate experiments on two different document streams: Web news and scientific literature.
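
The abstract does not give the EneTr formulas, so the following is only a minimal sketch of the general idea it names (energy that accumulates from word frequency and decays over time, plus a trend over recent energy values), not the authors' model. The function names, the decay factor, the window size, and both thresholds are hypothetical choices made for illustration.

```python
from typing import Dict, List


def energy_series(counts: List[float], decay: float = 0.5) -> List[float]:
    """Accumulate per-period frequency into energy, decaying old energy each step.

    Assumption: energy[t] = decay * energy[t-1] + counts[t]. This update rule
    is illustrative only; the abstract does not specify one.
    """
    energy, current = [], 0.0
    for c in counts:
        current = decay * current + c
        energy.append(current)
    return energy


def slope(values: List[float]) -> float:
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def latent_burst_candidates(
    word_counts: Dict[str, List[float]],
    decay: float = 0.5,       # hypothetical decay factor
    window: int = 4,          # hypothetical number of recent periods for the trend
    burst_level: float = 10.0,  # hypothetical energy level treated as "already bursting"
    min_slope: float = 0.5,   # hypothetical minimum upward trend
) -> List[str]:
    """Return words whose energy is still low but rising steadily."""
    candidates = []
    for word, counts in word_counts.items():
        energy = energy_series(counts, decay)
        recent = energy[-window:]
        if energy[-1] < burst_level and slope(recent) >= min_slope:
            candidates.append(word)
    return sorted(candidates)


if __name__ == "__main__":
    streams = {
        "rising": [0, 1, 1, 2, 3, 4],
        "flat": [2, 2, 2, 2, 2, 2],
        "bursting": [1, 5, 10, 20, 30, 40],
        "fading": [6, 5, 3, 2, 1, 0],
    }
    print(latent_burst_candidates(streams))
```

In this sketch a word counts as a latent burst candidate only when its current energy is below the burst level while its recent energy trend is clearly upward, so words that are already bursting, flat, or fading are excluded.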

Key words: Time series; Burst words; Latent burst words; Energy
Received: 12 October 2010      Published: 04 January 2011
CLC number: G250.76

Cite this article:

Hong Na, Zhang Zhixiong, Le Xiaoqiu. Detection Method of Latent Burst Word Based on the Clue of Energy Evolution. New Technology of Library and Information Service, 2010, 26(11): 45-52.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2010.11.07     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2010/V26/I11/45


