A semantic context model controlled by the positional information of special stop words is studied and established. The technique is applied to the automatic indexing of Chinese CXMARC metadata text and to the mining of subject information. The SWF algorithm, used as a preprocessing step in the automatic indexing of specialized Chinese text, efficiently reduces segmentation ambiguity within a field and shortens indexing time. The quality and efficiency of the traditional maximum matching algorithm are improved as a result.
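As a rough illustration of why stop-word preprocessing can help maximum matching, the sketch below compares plain forward maximum matching with the same matcher run on chunks obtained by cutting the text at stop words. It is not the SWF algorithm described in the paper: the function names, the toy dictionary, and the single stop word are assumptions made for this example, and the positional control of stop words that the abstract mentions is not modelled.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching: take the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def split_on_stop_words(text, stop_words):
    """Cut text into chunks at every stop word; the stop words themselves are dropped.

    Hypothetical helper: it cuts at every occurrence, without any positional check.
    """
    ordered = sorted(stop_words, key=len, reverse=True)  # prefer longer stop words
    chunks = []
    current = ""
    i = 0
    while i < len(text):
        hit = next((w for w in ordered if text.startswith(w, i)), None)
        if hit:
            if current:
                chunks.append(current)
            current = ""
            i += len(hit)
        else:
            current += text[i]
            i += 1
    if current:
        chunks.append(current)
    return chunks


def segment(text, dictionary, stop_words, max_len=4):
    """Segment each stop-word-delimited chunk independently with forward maximum matching."""
    tokens = []
    for chunk in split_on_stop_words(text, stop_words):
        tokens.extend(forward_max_match(chunk, dictionary, max_len))
    return tokens


if __name__ == "__main__":
    toy_dictionary = {"图书", "和服", "服务"}  # assumed toy dictionary
    print(forward_max_match("图书和服务", toy_dictionary))
    print(segment("图书和服务", toy_dictionary, {"和"}))
```

In this toy case, plain matching crosses the boundary at 和 and produces the word 和服, whereas cutting at the stop word first keeps 图书 and 服务 intact and also gives the matcher shorter strings to scan. Cutting at every stop word without regard to context can itself introduce errors, which is presumably why the paper controls the cut with positional information.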
Wang Lancheng, Wang Lishuang. Research on a New Text Automatic Indexing Technology Based on Digital Library[J]. New Technology of Library and Information Service (现代图书情报技术), 2006, 1(2): 5-9.