New Technology of Library and Information Service, 2010, Vol. 26, Issue 4: 24-34    DOI: 10.11925/infotech.1003-3513.2010.04.05
Knowledge Organization and Knowledge Management
Research Review on XML Retrieval
Liu Dan1, Kong Shao-Hua1, Lu Wei2
1 (Department of Information Management, Peking University, Beijing 100871, China)
2 (Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China)
Abstract

This paper reviews research on XML retrieval from the perspective of the information retrieval process. It first surveys four areas in turn: XML query languages, XML indexing, XML retrieval ranking methods, and XML retrieval evaluation. It then introduces several active research topics in XML retrieval, and closes with a brief account of the problems that need further study.

Key words: XML retrieval; XML query language; XML indexing; XML retrieval models; XML retrieval evaluation; XML retrieval hotspots
Received: 2010-03-25
CLC number: G354
Corresponding author: Liu Dan     E-mail: liudan1987@gmail.com
Cite this article:
Liu Dan, Kong Shao-Hua, Lu Wei. Research Review on XML Retrieval[J]. New Technology of Library and Information Service, 2010, 26(4): 24-34. DOI: 10.11925/infotech.1003-3513.2010.04.05.
Link to this article:
http://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.1003-3513.2010.04.05


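Copyright © 2015 Editorial Office of Data Analysis and Knowledge Discovery
Address: No. 33 Beisihuan Xilu, Zhongguancun, Haidian District, Beijing 100190
Tel/Fax: (010)82626611-6626, 82624938
E-mail: jishu@mail.las.ac.cn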