Cross Language Information Retrieval Model Based on Matrix-weighted Association Patterns Mining
Huang Mingxuan
Guangxi Key Laboratory Cultivation Base of Cross-border E-commerce Intelligent Information Processing, Guangxi University of Finance and Economics, Nanning 530003, China
Department of Computer Science, Guangxi University of Finance and Economics, Nanning 530003, China
Abstract [Objective] This paper addresses the query drift problem in cross-language information retrieval and proposes a new model for retrieving Chinese documents with Indonesian queries. [Methods] The model integrates matrix-weighted association pattern mining, query expansion, and user click and download behavior. [Results] On the NTCIR-5 CLIR data set, the R_prec, P@10 and P@20 values of the proposed model were higher than the 60% benchmark of monolingual retrieval. They were also 37% higher than the cross-language retrieval baseline and 28% higher than existing algorithms based on pseudo-relevance feedback. [Limitations] The model was examined only in a cross-language retrieval system built on the vector space model; it still needs to be tested with real-world search engines. [Conclusions] The proposed model can effectively reduce query drift in cross-language retrieval and retrieve more relevant Chinese documents with long Indonesian queries.
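The evaluation measures named in the abstract follow standard definitions, which can be illustrated with a minimal sketch. The function names, document identifiers and relevance judgments below are hypothetical and are not taken from the paper; the paper's own evaluation code is not shown in this abstract.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    # Dividing by k (not len(top_k)) penalizes rankings shorter than k.
    return sum(1 for doc in top_k if doc in relevant) / k


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant documents."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return precision_at_k(ranked, relevant, r)


# Hypothetical ranking and relevance judgments, for illustration only.
ranked_docs = ["d3", "d7", "d1", "d9", "d4"]
relevant_docs = {"d3", "d1", "d8"}

p_at_5 = precision_at_k(ranked_docs, relevant_docs, 5)  # 2 of the top 5 are relevant
r_prec = r_precision(ranked_docs, relevant_docs)        # R = 3; 2 of the top 3 are relevant
```

In an evaluation such as the one described, these per-query values would typically be averaged over all test queries before comparing systems.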
Received: 18 September 2016
Published: 22 February 2017