New Technology of Library and Information Service  2015, Vol. 31 Issue (7-8): 80-88    DOI: 10.11925/infotech.1003-3513.2015.07.11
User Behavior Analysis Based on Search Engine Log
Tong Guoping, Sun Jianjun
School of Information Management, Nanjing University, Nanjing 210093, China
Abstract  

[Objective] This paper analyzes user behavior based on search engine logs. [Methods] User behavior is analyzed in terms of query strings, query methods, query subjects, click behavior and user types, using word segmentation, statistical analysis, clustering analysis and visualization. [Results] Search users prefer to use 2-5 Chinese noun phrases, use colloquial query strings less often, dislike using advanced search functions, and prefer to use various query strings. The number of users shows peaks and valleys. The up-tail phenomenon is confirmed once again in this research. [Limitations] The amount of data used in this paper is not large enough, and details of user information are not considered. [Conclusions] Analysis of search engine logs helps to capture user behavior characteristics and improve search performance.

Received: 04 February 2015      Published: 25 August 2015
CLC Number: TP391

Cite this article:

Tong Guoping, Sun Jianjun. User Behavior Analysis Based on Search Engine Log. New Technology of Library and Information Service, 2015, 31(7-8): 80-88.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2015.07.11     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2015/V31/I7-8/80

