Online Public Opinion Hotspot Detection and Analysis Based on Document Clustering
Wang Wei, Xu Xin
(Department of Informatics, East China Normal University, Shanghai 200241, China)
Abstract: To meet the requirements of online public opinion analysis, this paper builds an online public opinion hotspot detection and analysis system based on document clustering. The system extracts document features from sample Web pages to build a vector space model and obtains the hotspot cluster with the OPTICS algorithm. Using the vector of the hotspot cluster, the Web pages are then clustered a second time. Finally, the system derives the temporal evolution pattern of public opinion to provide decision support for specific fields. This approach improves the quality of page correlation and allows public opinion to be analyzed more accurately.
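
The vector space model and the second clustering pass can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes pages are already segmented into tokens, uses a smoothed TF-IDF weighting chosen here for illustration, and takes the hotspot cluster (the page indices in `hotspot_seed`) as given instead of running OPTICS. The function names, the sample pages, and the 0.5 similarity threshold are all hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into L2-normalised sparse TF-IDF vectors (dicts)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # guard against empty documents
        # Smoothed IDF (an assumption of this sketch, not taken from the paper).
        vec = {t: (c / length) * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        vectors.append(normalise(vec))
    return vectors


def normalise(vec):
    """Scale a sparse vector to unit length; leave zero vectors unchanged."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def centroid(vectors):
    """Unit-length mean of a group of sparse vectors."""
    total = Counter()
    for vec in vectors:
        for t, w in vec.items():
            total[t] += w
    return normalise({t: w / len(vectors) for t, w in total.items()})


def second_pass(vectors, hotspot_vector, threshold):
    """Return indices of pages whose similarity to the hotspot vector
    reaches the threshold (the second clustering step)."""
    return [i for i, v in enumerate(vectors)
            if cosine(v, hotspot_vector) >= threshold]


# Hypothetical, pre-tokenised sample pages.
pages = [
    ["earthquake", "relief", "donation"],
    ["earthquake", "relief", "rescue"],
    ["football", "match", "score"],
    ["earthquake", "donation", "rescue"],
]
vectors = tfidf_vectors(pages)
hotspot_seed = [0, 1]  # assumed output of the density-based first pass
hotspot_vector = centroid([vectors[i] for i in hotspot_seed])
related = second_pass(vectors, hotspot_vector, threshold=0.5)
print(related)
```

Here page 3 shares vocabulary with the seed cluster and is pulled into the hotspot, while the unrelated page 2 is left out. In a real system the threshold would need tuning against labelled data.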
Received: 12 January 2009
Published: 25 March 2009
Corresponding Authors:
Wang Wei
E-mail: asdwangwei@yahoo.com.cn
About the authors: Wang Wei, Xu Xin