New Technology of Library and Information Service  2012, Vol. Issue (11): 86-91    DOI: 10.11925/infotech.1003-3513.2012.11.14
A Method for Detecting the Hot Topic of Literature Based on Lifecycle——A Case Study of Neoplasm Field
Zhao Yingguang, An Xinying, Li Yong, Jia Xiaofeng
Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China
Abstract  Existing methods for detecting hot topics in the literature have shortcomings, such as reliance on a single indicator and inefficient filtering of high-frequency common words. This paper applies lifecycle theory and the TF*PDF algorithm to hot topic detection in the literature: it finds hot words by tracking how word usage varies over time, then locates the time at which each hot word appeared. Empirical tests show that the approach is effective at filtering out high-frequency common terms and at identifying hot research topics within time windows.
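As a rough illustration of the weighting step, the sketch below follows the TF*PDF formula as published by Bun and Ishizuka: a term's weight is the sum, over channels, of its normalized frequency in the channel multiplied by the exponential of the share of that channel's documents containing it. How the paper defines channels and models the lifecycle is not stated in the abstract, so the input layout, the function names `tf_pdf` and `peak_window`, and the use of a simple per-window maximum as a stand-in for the lifecycle step are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Compute TF*PDF weights.

    channels: dict mapping a channel name (assumed here: e.g. a journal)
              to a list of documents, each document a list of terms.
    Returns a dict mapping term -> weight summed over all channels.
    """
    weights = Counter()
    for docs in channels.values():
        term_freq = Counter()  # occurrences of each term in the channel
        doc_freq = Counter()   # number of documents containing each term
        for doc in docs:
            term_freq.update(doc)
            doc_freq.update(set(doc))
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        if norm == 0:
            continue  # empty channel contributes nothing
        n_docs = len(docs)
        for term, freq in term_freq.items():
            # normalized term frequency * exp(proportional document frequency)
            weights[term] += (freq / norm) * math.exp(doc_freq[term] / n_docs)
    return dict(weights)


def peak_window(windows):
    """Hypothetical helper: for each term, return the time window in which
    its TF*PDF weight is highest.

    windows: dict mapping a window label (e.g. a year) to a `channels` dict.
    """
    best = {}
    for label, channels in windows.items():
        for term, weight in tf_pdf(channels).items():
            if term not in best or weight > best[term][1]:
                best[term] = (label, weight)
    return {term: label for term, (label, _) in best.items()}
```

Because the exponential factor rewards terms that appear in a large share of a channel's documents, while the frequency is normalized within each channel, comparing the resulting weights across time windows gives one simple way to see when a term becomes prominent.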
Key words  Lifecycle theory      Hot topic detection      Text mining
Received: 29 October 2012      Published: 06 February 2013
CLC Number:  G250

Cite this article:

Zhao Yingguang, An Xinying, Li Yong, Jia Xiaofeng. A Method for Detecting the Hot Topic of Literature Based on Lifecycle——A Case Study of Neoplasm Field. New Technology of Library and Information Service, 2012, (11): 86-91.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2012.11.14     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2012/V/I11/86
