To address the shortcomings of traditional TF-IDF in text filtering, the authors propose a text information filtering algorithm based on feature extraction. The paper briefly analyses the principles and process of text information filtering, then focuses on the design and implementation of the filtering algorithm. Experimental results show that the new approach significantly outperforms the traditional information filtering method.
Yang Zhizhuo, Han Xie. An Algorithm of Text Information Filtering Based on Feature Extraction[J]. New Technology of Library and Information Service, 2008, 24(4): 29-34.