Research of Mining the Word Category Knowledge for Chinese Syntactic Function Distribution Knowledge Base
Wang Dongbo¹, Zhu Danhao²
1. College of Information and Technology Science, Nanjing Agricultural University, Nanjing 210095, China; 2. International Institute for Software Technology, United Nations University, Macao 3058, China
Abstract Based on the syntactic function distribution of Chinese words, this paper builds a syntactic function distribution knowledge base from the Tsinghua treebank and stores it in a multi-way tree structure. Chinese word category knowledge is then mined from this knowledge base using a K-medoids clustering algorithm for sparse feature clustering.
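To make the clustering step concrete, the following is a minimal sketch of K-medoids over sparse feature vectors. It is not the authors' implementation: the Manhattan distance, the seeding with the first k items, the function names, and the toy distributions are all assumptions made for illustration, and none of the data comes from the Tsinghua treebank.

```python
from typing import Dict, List, Tuple

# A sparse vector maps a syntactic function name to its relative frequency.
# Functions that a word never fills are simply absent from the dict.
SparseVec = Dict[str, float]


def distance(a: SparseVec, b: SparseVec) -> float:
    """Manhattan distance over the keys present in either vector (assumed metric)."""
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))


def assign(vectors: List[SparseVec], medoids: List[int]) -> List[int]:
    """Label each vector with the position of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: distance(v, vectors[medoids[c]]))
        for v in vectors
    ]


def k_medoids(
    vectors: List[SparseVec], k: int, max_iter: int = 100
) -> Tuple[List[int], List[int]]:
    """Return (labels, medoid indices). Seeds with the first k items (assumption)."""
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(vectors, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance
            # to all other members of its cluster.
            new_medoids.append(
                min(members, key=lambda i: sum(distance(vectors[i], vectors[j]) for j in members))
            )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(vectors, medoids), medoids


# Hypothetical distributions, invented for this example only.
words = {
    "book": {"subject": 0.5, "object": 0.5},
    "run": {"predicate": 0.9, "attributive": 0.1},
    "table": {"subject": 0.4, "object": 0.6},
    "eat": {"predicate": 1.0},
}
labels, medoids = k_medoids(list(words.values()), k=2)
```

On this toy input the two noun-like words share one label and the two verb-like words share the other. Because medoids are always actual data points, each cluster can be summarised by a representative word, which is one reason K-medoids is a natural fit for category mining.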
Received: 20 November 2012
Published: 14 May 2013