New Technology of Library and Information Service  2012, Vol. Issue (11): 86-91    DOI: 10.11925/infotech.1003-3513.2012.11.14
A Method for Detecting the Hot Topic of Literature Based on Lifecycle——A Case Study of Neoplasm Field
Zhao Yingguang, An Xinying, Li Yong, Jia Xiaofeng
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  Existing methods for detecting hot topics in the literature have shortcomings, such as reliance on a single indicator and inefficient filtering of high-frequency common words. This paper applies lifecycle theory and the TF*PDF algorithm to hot topic detection in the literature: it finds hot words by tracking how word usage varies over time, then locates the time at which each hot word appeared. Empirical tests show that the approach is effective at filtering out frequently used common terms and at identifying hot research topics within time windows.
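As a rough illustration of the TF*PDF weighting named in the abstract, the sketch below follows the formulation of Bun and Ishizuka: a term's weight is the sum, over channels, of its normalized frequency in the channel multiplied by the exponential of the proportion of that channel's documents containing it. Treating each time window as a channel, and the names `tf_pdf` and `channels`, are assumptions made for this example; the abstract does not state how the paper defines channels or how it combines the weights with the lifecycle model.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Compute TF*PDF weights for terms.

    channels: dict mapping a channel label (assumed here: a time window)
              to a list of documents, each document being a list of tokens.
    Returns a dict mapping each term to its TF*PDF weight.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue
        term_freq = Counter()  # occurrences of each term in this channel
        doc_freq = Counter()   # number of documents containing each term
        for tokens in docs:
            term_freq.update(tokens)
            doc_freq.update(set(tokens))
        # Normalize frequencies by the L2 norm of the channel's frequency vector.
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue
        n_docs = len(docs)
        for term, freq in term_freq.items():
            # Normalized term frequency times exp(proportional document frequency).
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / n_docs)
    return dict(weights)


# Toy data, invented for illustration only.
channels = {
    "2010": [["mirna", "tumor"], ["tumor", "cell"]],
    "2011": [["mirna", "tumor"], ["mirna", "stem"]],
}
weights = tf_pdf(channels)
ranked = sorted(weights, key=weights.get, reverse=True)
```

Because the document-frequency factor is exponential, a term that occurs in most documents of a channel is weighted far more heavily than one concentrated in a few documents. Computing the weight separately per time window, rather than summing as above, would give the kind of time series that a lifecycle analysis could track.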
Key words: Lifecycle theory; Hot topic detection; Text mining
Received: 29 October 2012      Published: 06 February 2013
CLC number: G250

Cite this article:

Zhao Yingguang, An Xinying, Li Yong, Jia Xiaofeng. A Method for Detecting the Hot Topic of Literature Based on Lifecycle——A Case Study of Neoplasm Field. New Technology of Library and Information Service, 2012, (11): 86-91.

