To address the problem of training and test corpora for text classification, we have built a very large classified and annotated corpus. It offers rich domain information, a principled classification scheme, an extensible storage format, and structured semantic annotations. The corpus is suited to building training and test sets for text classification, topic identification, and information retrieval (IR).
刘华 . 超大规模分类语料库的构建[J]. 现代图书情报技术, 2006, 22(1): 71-73.
Liu Hua. Construction of a Super Classed and Denoted Corpus. New Technology of Library and Information Service, 2006, 22(1): 71-73.