New Technology of Library and Information Service  2015, Vol. 31 Issue (2): 15-23    DOI: 10.11925/infotech.1003-3513.2015.02.03
Feature Analysis and Automatic Identification of Query Specificity
Tang Xiangbin1, Lu Wei2, Zhang Xiaojuan1, Huang Shihao1
1. School of Information Management, Wuhan University, Wuhan 430072, China;
2. Center for the Studies of Information Resources, Wuhan University, Wuhan 430072, China
[Objective] This paper constructs a human-annotated collection from Sogou query logs, analyzes the features of query specificity, builds classifiers to identify it automatically, and evaluates and compares the identification results. [Methods] The basic features and content features of the queries are selected and analyzed. Decision tree, SVM, and Naive Bayes classifiers are then built and trained to classify query specificity automatically. [Results] With these features, query specificity is identified effectively: the macro-averaged F-measures of the identification results are all above 0.8. [Limitations] Users' clickthrough information is not included in feature selection, and whether the conditional independence assumption of the Naive Bayes classifier can be disregarded in this particular experiment needs further verification. [Conclusions] The basic features and content features of queries are, by themselves, sufficient to distinguish broad, medium, and specific queries well.
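The macro-averaged F-measure reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function name `macro_f1` and the sample labels are hypothetical, and only the three class names (broad, medium, specific) come from the abstract. The sketch assumes the usual definition, in which an F1 score is computed per class and the scores are averaged without weighting by class size.

```python
from collections import Counter

# Class names taken from the abstract; everything else is illustrative.
LABELS = ("broad", "medium", "specific")


def macro_f1(gold, predicted, labels=LABELS):
    """Return the unweighted mean of the per-class F1 scores."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[g] += 1   # class g was missed

    scores = []
    for label in labels:
        predicted_count = tp[label] + fp[label]
        gold_count = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / gold_count if gold_count else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)

    return sum(scores) / len(scores)


# Hypothetical annotations for six queries (not data from the paper).
gold = ["broad", "broad", "medium", "medium", "specific", "specific"]
predicted = ["broad", "medium", "medium", "medium", "specific", "broad"]
score = macro_f1(gold, predicted)
```

Because every class contributes equally to the average, a classifier cannot reach a high macro-averaged score by doing well only on the most frequent class.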

Key words: Query specificity; Decision tree; SVM; Naive Bayes
Received: 23 April 2014      Published: 17 March 2015
Tang Xiangbin, Lu Wei, Zhang Xiaojuan, Huang Shihao. Feature Analysis and Automatic Identification of Query Specificity. New Technology of Library and Information Service, 2015, 31(2): 15-23.

