Data Analysis and Knowledge Discovery  2019, Vol. 3 Issue (2): 79-89    DOI: 10.11925/infotech.2096-3467.2018.0449
Automatically Rating Query Ambiguity with Alt-Metrics
Sisi Gui 1,2, Xiaojuan Zhang 3, Xin Wang 1,2
1 School of Information Management, Wuhan University, Wuhan 430072, China
2 Institute for Information Retrieval and Knowledge Mining, Wuhan University, Wuhan 430072, China
3 School of Computer and Information Science, Southwest University, Chongqing 400715, China
Abstract  

[Objective] This paper aims to find better alt-metrics for automatically rating query ambiguity. [Methods] First, we chose several existing automatic metrics based on documents, users and queries. Then, we modified one of them using query category occurrences. Finally, we examined the relationship between the modified alt-metric and the other automatic and human rating metrics. Their correlations were tested with the Pearson and symmetric AP correlation coefficients. Their degrees of agreement were tested with macro average accuracy and macro average F1. [Results] The proposed method showed a significant relationship with human rating, and achieved an F1 of 0.623 and an accuracy of 0.707. [Limitations] The proposed model was examined only with data from online directories. [Conclusions] Automatic rating metrics for query ambiguity can hardly be replaced by other automatic counterparts. Considering the occurrences of top-level categories for each query could improve the degree of agreement of automatic metrics. Compared with the existing automatic metrics, the proposed method can be used to replace the human metrics for query ambiguity.
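
The four evaluation measures named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the sample values are hypothetical, the AP correlation follows its commonly used definition and assumes no tied scores, the symmetric variant is taken as the mean of the two directions, and macro F1 is taken as the mean of per-class F1 scores (other definitions exist, so the variant used in the paper may differ).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def tau_ap(scores, reference):
    """AP correlation of the ranking induced by `scores` against `reference`.

    Assumes higher score = higher rank and no ties (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total = 0.0
    for pos in range(1, len(order)):
        item = order[pos]
        # Items ranked above `item` that the reference also ranks above it.
        correct = sum(1 for j in order[:pos] if reference[j] > reference[item])
        total += correct / pos
    return 2.0 * total / (len(order) - 1) - 1.0


def symmetric_tau_ap(a, b):
    """Symmetric AP correlation: mean of the two one-directional values."""
    return (tau_ap(a, b) + tau_ap(b, a)) / 2.0


def macro_scores(gold, predicted):
    """Return (macro average accuracy, macro average F1) over all labels."""
    labels = sorted(set(gold) | set(predicted))
    n = len(gold)
    accs, f1s = [], []
    for c in labels:
        tp = sum(1 for g, p in zip(gold, predicted) if g == c and p == c)
        fp = sum(1 for g, p in zip(gold, predicted) if g != c and p == c)
        fn = sum(1 for g, p in zip(gold, predicted) if g == c and p != c)
        tn = n - tp - fp - fn
        accs.append((tp + tn) / n)  # one-vs-rest accuracy for class c
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0)
    return sum(accs) / len(labels), sum(f1s) / len(labels)


if __name__ == "__main__":
    # Hypothetical ambiguity scores for five queries (illustrative only).
    human = [0.9, 0.7, 0.4, 0.3, 0.1]
    auto = [0.8, 0.5, 0.6, 0.2, 0.1]
    print("Pearson:", pearson(human, auto))
    print("Symmetric AP correlation:", symmetric_tau_ap(human, auto))

    # Hypothetical class labels derived from the two ratings.
    gold = ["ambiguous", "ambiguous", "clear", "clear"]
    pred = ["ambiguous", "clear", "clear", "clear"]
    print("Macro accuracy, macro F1:", macro_scores(gold, pred))
```

Correlation here compares the continuous scores or the rankings they induce, while agreement compares the discrete ambiguity classes assigned by two rating methods, which is why the sketch takes scores for the first two functions and labels for the last.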

Key words: Query Ambiguity Rating; Automatic Rating; Human Rating; Alternativeness; Correlation; Agreement
Received: 23 April 2018      Published: 27 March 2019

Cite this article:

Sisi Gui, Xiaojuan Zhang, Xin Wang. Automatically Rating Query Ambiguity with Alt-Metrics. Data Analysis and Knowledge Discovery, 2019, 3(2): 79-89.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2018.0449     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2019/V3/I2/79
