New Technology of Library and Information Service  2014, Vol. 30 Issue (1): 72-78    DOI: 10.11925/infotech.1003-3513.2014.01.11
Chinese Organization Name Recognition in User Query Log
Guan Xiaoda1, Lv Xueqiang1, Li Zhuo1, Zheng Luexing1, 2
1Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China; 2Institute of Computational Linguistics, Peking University, Beijing 100871, China
Abstract  [Objective] To address the shortage of annotated query log data and the information asymmetry that hinder organization name recognition in user query logs. [Methods] The paper proposes a method for automatically creating training data, which alleviates the shortage of annotated user query log data. The authors introduce adhesion features and construct a CRF model that recognizes organization names by integrating context information. [Results] Experiments on the Sogou user query log show that precision reaches 72.80%, recall reaches 86.73%, and F-measure reaches 79.16%. The method improves F-measure by 30% compared with the traditional organization name recognition method. [Limitations] A model trained on the automatically created training set will have a larger error than one trained on standard annotated user query log data. The scale of the organization name set affects the completeness of the model’s context knowledge. [Conclusions] The experimental results demonstrate that the method is effective.
Key words: User query log, Chinese organization name, Corpus construction, Adhesion feature, CRF
Received: 14 February 2014      Published: 14 February 2014
CLC Number: TP391

Guan Xiaoda, Lv Xueqiang, Li Zhuo, Zheng Luexing. Chinese Organization Name Recognition in User Query Log. New Technology of Library and Information Service, 2014, 30(1): 72-78.
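
The three scores quoted in the abstract can be checked against one another. The sketch below assumes that the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as quoted in the abstract, written as fractions.
reported_precision = 0.7280
reported_recall = 0.8673

# Assumption: balanced weighting (beta = 1), which the abstract does not state.
f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 79.16%"
```

Under that assumption the computed value agrees with the 79.16% given in the abstract.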

