New Technology of Library and Information Service  2010, Vol. 26 Issue (4): 24-34    DOI: 10.11925/infotech.1003-3513.2010.04.05
Research Review on XML Retrieval
Liu Dan1, Kong Shao-Hua1, Lu Wei2
1(Department of Information Management, Peking University, Beijing 100871, China)
2(Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China)
Abstract  

This paper reviews research on XML retrieval from the perspective of the information retrieval process. It first reviews four areas in turn: XML query languages, XML indexing, XML retrieval ranking approaches, and the evaluation of XML retrieval. It then introduces a number of XML retrieval hotspots. Finally, it summarizes the review and briefly describes some issues that need further study.

Key words: XML retrieval; XML query language; XML indexing; XML retrieval models; XML retrieval evaluation; XML retrieval hotspots
Received: 25 March 2010      Published: 25 April 2010
CLC Number: G354
Corresponding Authors: Liu Dan     E-mail: liudan1987@gmail.com

Cite this article:

Liu Dan,Kong Shao-Hua,Lu Wei. Research Review on XML Retrieval. New Technology of Library and Information Service, 2010, 26(4): 24-34.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2010.04.05     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2010/V26/I4/24

