New Technology of Library and Information Service (现代图书情报技术)  2014, Vol. 30, Issue 12: 10-17     https://doi.org/10.11925/infotech.1003-3513.2014.12.02
Section: Digital Library
Analysis for the Search Behavior of Web Users
Chen Yong1, Li Honglian1, Lv Xueqiang2
1. School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China;
2. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
Abstract

[Objective] To compile statistics on and analyze data about the behavior of Web users, providing a basis for further improving the performance of search engines. [Methods] The characteristics of users' queries are analyzed, as are the results that the search engine returns for those queries. Borrowing the concept of entropy, users' clicks are analyzed quantitatively. [Results] Across all user records, queries without spaces account for 93.66%, and in 83.59% of these the user entered a relatively long query string; deterministic clicks reach 64.26%; and 71.26% of users viewed the first three returned results. [Limitations] The size of the user population affects the analysis results to some extent. [Conclusions] The experimental results show that the reliability of user clicks is closely related to their certainty, and that search engines have some deficiencies in locating the keywords within longer queries.

Key words: User behavior    Log analysis    Search engine    Entropy
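The three measurements named in the abstract (share of queries without spaces, entropy of clicks, share of clicks on the first three results) can be illustrated with a minimal sketch. It assumes, because the abstract does not give the paper's exact definitions, that click certainty for a query is measured by the Shannon entropy of the distribution of clicked URLs (0 bits when every click lands on the same URL) and that a log record exposes the query string, the clicked URL and the 1-based rank of the clicked result. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicked URLs recorded for one query.

    Assumption: lower entropy is read as more "certain" clicking; the
    paper's own definition may differ.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def share_without_spaces(queries):
    """Fraction of query strings that contain no whitespace character."""
    if not queries:
        return 0.0
    no_space = sum(1 for q in queries if not any(ch.isspace() for ch in q))
    return no_space / len(queries)


def share_within_top_k(click_ranks, k=3):
    """Fraction of clicks whose 1-based result rank is at most k."""
    if not click_ranks:
        return 0.0
    return sum(1 for r in click_ranks if r <= k) / len(click_ranks)


if __name__ == "__main__":
    # Hypothetical toy log, not data from the paper.
    print(click_entropy(["u1", "u1", "u1"]))        # all clicks on one URL
    print(click_entropy(["u1", "u2"]))              # clicks split evenly
    print(share_without_spaces(["北京天气", "beijing weather"]))
    print(share_within_top_k([1, 2, 5, 3]))
```

Under these assumptions, a query whose clicks all go to one URL has entropy 0, and the entropy grows as clicks spread over more URLs, which is one way to separate deterministic from exploratory clicking.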
Received: 2014-06-26      Published: 2015-01-20
CLC number: TP391
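Funding:

This work is one of the research outcomes of the National Natural Science Foundation of China project "Research on Ontology-Based Automatic Patent Indexing" (No. 61271304); the Key Project of the Beijing Municipal Education Commission Science and Technology Development Plan and Class B Key Project of the Beijing Natural Science Foundation, "Research on Domain-Oriented Precise Search Methods for Multimodal Internet Information" (No. KZ201311232037); and the Innovation Team Building and Teacher Career Development Program for Beijing Municipal Universities (No. IDHT20130519).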

Corresponding author: Chen Yong     E-mail: cy565025164@163.com
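Author contributions: Lv Xueqiang: proposed the research topic and collected the data; Chen Yong: proposed the research approach, designed the study, analyzed the data and drafted the paper; Li Honglian: revised the paper.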
Cite this article:
Chen Yong, Li Honglian, Lv Xueqiang. Analysis for the Search Behavior of Web Users. New Technology of Library and Information Service, 2014, 30(12): 10-17.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2014.12.02      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2014/V30/I12/10


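Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery (《数据分析与知识发现》编辑部)
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing    Postal code: 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn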