Automatic Recognition of Exploratory and Lookup Intents Based on Berry Picking Model
Liu Jie1, Gui Sisi2, Zhang Xiaojuan3
1College of Computer and Information Science, Southwest University, Chongqing 400715, China; 2College of Information Management, Nanjing Agricultural University, Nanjing 210095, China; 3School of Public Administration, Sichuan University, Chengdu 610065, China
[Objective] This paper proposes new classification features to improve the accuracy of automatically recognizing exploratory and lookup search intents. [Methods] First, we collected 1,805 queries from the AOL search log and labelled them manually. Second, inspired by the Berry Picking model, we designed classification features from three aspects: query nature, search process, and information source. Third, we evaluated the proposed features with five classifiers: Naive Bayes, SVM, Decision Tree, Random Forest, and Neural Network. Finally, we examined the classification performance of individual features and of feature sets. [Results] All three types of features effectively distinguish exploratory from lookup intents, with the query nature-based features performing best. Among the five models, the neural network performed best (Accuracy=0.8172, Precision=0.8494, Recall=0.7747, F1 Score=0.8103). [Limitations] We did not evaluate the proposed features on multiple datasets. User search behaviors need to be explored more fully to derive more effective classification features. Moreover, the dataset used for exploratory/lookup intent recognition was limited in size because manual labelling is time-consuming and labor-intensive. [Conclusions] The proposed features based on the Berry Picking model can effectively distinguish between exploratory and lookup intents.
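As a quick check on how the reported metrics relate, the sketch below computes precision, recall, and F1 for a binary classifier and confirms that the reported F1 score is the harmonic mean of the reported precision and recall. The function names and the confusion-matrix counts in the usage lines are hypothetical, chosen only for illustration. The abstract does not state which class (exploratory or lookup) is treated as positive, so that choice is left open here.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for the positive class from raw counts.

    tp, fp, fn: true positives, false positives, false negatives.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Values reported in the abstract for the neural network model.
reported_precision = 0.8494
reported_recall = 0.7747
print(round(f1_from(reported_precision, reported_recall), 4))

# Hypothetical counts, not taken from the paper.
print(precision_recall_f1(tp=8, fp=2, fn=2))
```

The first printed value can be compared directly with the F1 Score given in the results.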
Liu Jie, Gui Sisi, Zhang Xiaojuan. Automatic Recognition of Exploratory and Lookup Intents Based on Berry Picking Model[J]. Data Analysis and Knowledge Discovery, 2024, 8(4): 152-166.