Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation
Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan
School of Information Engineering, East China Jiaotong University, Nanchang 330000, China

Abstract [Objective] This paper maps heterogeneous multi-modal data into an isomorphic subspace, aiming to bridge the semantic gap between modalities and improve the accuracy of cross-modal retrieval. [Methods] First, we determined the high-order semantic correlation between multi-modal data. Then, we combined the annotation information and the structure information of the multi-modal data. Finally, we transformed the data of different modalities into an isomorphic representation so that they can be retrieved directly against one another. [Results] We evaluated our method on three open datasets: WIKI, NUS-WIDE and XMedia. The average MAP values obtained by our method were 0.1113, 0.0910 and 0.1850 higher, respectively, than the best results of CCA, JGRHML, SCM and JFSSL. [Limitations] Our method is not applicable to semi-supervised or unsupervised data. [Conclusions] The proposed method effectively improves the accuracy of cross-modal retrieval.
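
As a rough illustration of the evaluation reported above, the following sketch computes MAP for cross-modal retrieval once both modalities already live in a shared subspace. It is not the authors' implementation: the use of cosine similarity for ranking, the single-label relevance rule, and the names `cosine_similarity` and `mean_average_precision` are all assumptions made for this example, and the subspace projection itself is taken as given.

```python
import numpy as np


def cosine_similarity(queries, gallery):
    """Pairwise cosine similarity between rows of two feature matrices."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return q @ g.T


def mean_average_precision(queries, query_labels, gallery, gallery_labels):
    """MAP over all queries; a gallery item is relevant if its label matches.

    Assumes (hypothetically) that `queries` and `gallery` are features of two
    different modalities that were already projected into a common subspace.
    """
    sims = cosine_similarity(queries, gallery)
    ap_values = []
    for i in range(sims.shape[0]):
        order = np.argsort(-sims[i])  # most similar gallery items first
        relevant = (gallery_labels[order] == query_labels[i]).astype(float)
        if relevant.sum() == 0:
            continue  # skip queries with no relevant gallery item
        precision_at_k = np.cumsum(relevant) / np.arange(1, len(relevant) + 1)
        ap_values.append((precision_at_k * relevant).sum() / relevant.sum())
    return float(np.mean(ap_values)) if ap_values else 0.0


# Toy image-to-text query: two "image" vectors against three "text" vectors.
image_feats = np.array([[1.0, 0.0], [0.0, 1.0]])
image_labels = np.array([0, 1])
text_feats = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]])
text_labels = np.array([0, 1, 0])
print(mean_average_precision(image_feats, image_labels, text_feats, text_labels))
```

Swapping the roles of the two feature matrices gives the text-to-image direction; averaging the two directions is one common way to obtain a single average MAP figure, though the abstract does not say how its averages were formed.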
Received: 05 August 2019
Published: 15 June 2020
Corresponding Authors:
Tian Xiaomeng
E-mail: tianxiaomeng2016@126.com