New Technology of Library and Information Service  2016, Vol. 32 Issue (11): 34-43    DOI: 10.11925/infotech.1003-3513.2016.11.05
Research Paper
The Impacts of Query Specificity on Information Retrieval
Ren Ke1, Lu Wei1,2, Ding Heng1
1School of Information Management, Wuhan University, Wuhan 430072, China
2Center for the Study of Information Resources, Wuhan University, Wuhan 430072, China

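[Objective] To comprehensively analyze the retrieval effectiveness of queries with different degrees of specificity, as a reference for improving search engine performance and the user search experience. [Methods] Based on queries from the TREC Web Track, we manually built a query-specificity annotation set and selected three models: the language model with Dirichlet smoothing, the language model with linear interpolation smoothing, and BM25. Taking commonly used information retrieval evaluation metrics as the benchmark, we examined how strong versus weak query specificity affects retrieval effectiveness at different levels. [Results] The difference in retrieval effectiveness between strong- and weak-specificity queries is largest among the top few retrieved results, where strong-specificity queries perform markedly better than weak-specificity ones. [Limitations] The experiments were run only on TREC data sets and need further validation on other data sets. [Conclusions] Along the specificity dimension, search engines should concentrate on the accuracy of the top few retrieved results and use this as the entry point for improving retrieval models.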
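The three retrieval models named in the methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the tokenized-list document representation, and the default parameters (`mu`, `lam`, `k1`, `b`) are assumptions chosen for illustration, and the BM25 variant shown uses one common non-negative IDF form.

```python
import math
from collections import Counter


def dirichlet_score(query, doc, coll_tf, coll_len, mu=2000.0):
    """Query log-likelihood under a language model with Dirichlet smoothing."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        if p_coll == 0:
            continue  # term never occurs in the collection; skip it
        score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
    return score


def linear_interpolation_score(query, doc, coll_tf, coll_len, lam=0.5):
    """Query log-likelihood with linear interpolation (Jelinek-Mercer) smoothing."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = coll_tf.get(term, 0) / coll_len
        if p_coll == 0:
            continue  # term never occurs in the collection; skip it
        p_doc = tf[term] / len(doc) if doc else 0.0
        score += math.log((1 - lam) * p_doc + lam * p_coll)
    return score


def bm25_score(query, doc, doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """BM25 score; doc_freq maps a term to the number of documents containing it."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        n_t = doc_freq.get(term, 0)
        if n_t == 0 or tf[term] == 0:
            continue  # term contributes nothing to this document
        idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score
```

Each function takes a query and a document as lists of tokens. The language-model scores are log-probabilities, so they are negative and only their relative order across documents matters.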


[Objective] This paper analyzes the impact of query specificity on the effectiveness of information retrieval systems, aiming to improve search engine performance and user experience. [Methods] First, we manually constructed a specificity-labeled set of queries from the TREC Web Track. Second, we adopted the Dirichlet language model, the linear interpolation language model and the BM25 model to measure each query’s retrieval performance. Finally, we took commonly used information retrieval evaluation metrics as the benchmark to explore the impact of query specificity. [Results] The performance gap was largest for the highest-ranked results, where more specific queries retrieved clearly better results than their broader counterparts. [Limitations] The experiments used only data provided by TREC. More studies are needed to test the findings on other data sets. [Conclusions] Search engines should focus on the precision of the highest-ranked results and use it as the starting point for improving their retrieval models.
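To make the comparison at the highest-ranked results concrete, here is a small Python sketch that averages precision at a cutoff k separately for each specificity group. The abstract does not name the evaluation metrics, so precision at k is an assumption here, and the function names, the `runs`/`qrels`/`labels` dictionaries, and the "strong"/"weak" labels are hypothetical.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


def mean_precision_by_group(runs, qrels, labels, k):
    """Average precision at k over the queries in each specificity group.

    runs:   query id -> ranked list of document ids
    qrels:  query id -> set of relevant document ids
    labels: query id -> specificity label (e.g. "strong" or "weak")
    """
    totals, counts = {}, {}
    for qid, ranked in runs.items():
        group = labels[qid]
        p = precision_at_k(ranked, qrels.get(qid, set()), k)
        totals[group] = totals.get(group, 0.0) + p
        counts[group] = counts.get(group, 0) + 1
    return {group: totals[group] / counts[group] for group in totals}
```

Calling `mean_precision_by_group` with several small values of k would show whether the gap between groups narrows as the cutoff moves down the ranking.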

Key words: Query intention; Query specificity; Retrieval result
Received: 2016-07-18
Ren Ke, Lu Wei, Ding Heng. The Impacts of Query Specificity on Information Retrieval[J]. New Technology of Library and Information Service, 2016, 32(11): 34-43. DOI: 10.11925/infotech.1003-3513.2016.11.05.