New Technology of Library and Information Service  2006, Vol. 1 Issue (12): 81-84    DOI: 10.11925/infotech.1003-3513.2006.12.20
Research and Implementation of Chinese Web-text Clustering
Yang Xueming
(Network Center, Ningbo University, Ningbo 315211, China)

Automatic text clustering has been proposed and studied in practical applications. This paper proposes a text clustering framework that combines the hierarchical agglomerative clustering (HAC) and K-Means algorithms, and evaluates the framework experimentally.
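
The abstract does not say how the two algorithms are combined. One common way to pair them, shown here only as a minimal sketch and not as the paper's method, is to run HAC first to obtain k seed centroids and then refine the partition with K-Means. The function names, the centroid-linkage merge rule, the Euclidean distance, and the assumption that documents are already numeric feature vectors (for example term weights) are all choices made for this illustration.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dist(a, b):
    """Euclidean distance between two vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hac_seeds(points, k):
    """Centroid-linkage HAC: merge the two closest clusters until k remain.

    Returns the centroids of the k remaining clusters.
    """
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [mean(c) for c in clusters]


def kmeans(points, seeds, max_iter=100):
    """Standard K-Means iterations starting from the given seed centroids."""
    centroids = [list(c) for c in seeds]  # copy so the caller's seeds are untouched
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean(members)
    return labels, centroids


def hac_then_kmeans(points, k):
    """Hypothetical combination: HAC picks the seeds, K-Means refines them."""
    return kmeans(points, hac_seeds(points, k))
```

Calling `hac_then_kmeans(vectors, k)` returns one cluster label per input vector plus the final centroids. The HAC step here is quadratic in the number of clusters at every merge, so in practice it would likely be run on a sample of the documents rather than on the full collection.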

Key words: Automatic text clustering; Information retrieval; HAC; K-Means
Received: 14 September 2006      Published: 25 December 2006


Corresponding author: Yang Xueming

Cite this article:

Yang Xueming. Research and Implementation of Chinese Web-text Clustering. New Technology of Library and Information Service, 2006, 1(12): 81-84.


