Research on Extraction of Hot Keywords
Cheng Xiao, Lu Bei, Chen Zhiqun
Institute of Computer Application Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract: To extract hot keywords from candidate keywords gathered over multiple phases, this paper processes large volumes of data, identifies meaningless words from their statistical regularities over time, and proposes the concept of Union Variance (UV). An HK (Hot Keywords) formula is then constructed through multi-feature fusion to extract the hot keywords. Experimental results show that the method is efficient for hot topic extraction.
Received: 16 August 2010
Published: 04 January 2011