Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (5): 84-91    DOI: 10.11925/infotech.2096-3467.2019.0912
Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation
Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan
School of Information Engineering, East China Jiaotong University, Nanchang 330000, China

[Objective] This paper maps heterogeneous multi-modal data into isomorphic representations, aiming to bridge the semantic gap and improve the accuracy of cross-modal retrieval. [Methods] First, we determined the high-order semantic correlation between multi-modal data. Then, we combined the annotation information and the structure information of the multi-modal data. Finally, we transformed the data of different modalities into isomorphic representations so that they can be retrieved directly. [Results] We evaluated our method on three open datasets: Wiki, NUS-WIDE and XMedia. The average MAP of our method was 0.1113, 0.0910 and 0.1850 higher, respectively, than the best result among CCA, JGRHML, SCM and JFSSL. [Limitations] Our method is not applicable to semi-supervised or unsupervised data. [Conclusions] The proposed method effectively improves the accuracy of cross-modal retrieval.
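The retrieval step described in the abstract (projecting each modality into a shared space and then matching directly) can be illustrated with a minimal sketch. This is not the authors' algorithm: the projection matrices `U_img` and `U_txt` below are random placeholders standing in for whatever mappings the method learns, the feature dimensions are arbitrary, and cosine similarity is assumed as the matching score.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def rank_by_cosine(query_emb, gallery_emb):
    """For each query row, return gallery indices ordered from most to least similar."""
    sims = l2_normalize(query_emb) @ l2_normalize(gallery_emb).T
    return np.argsort(-sims, axis=1)

rng = np.random.default_rng(0)
n, d_img, d_txt, d_common = 6, 128, 10, 4   # hypothetical sizes

img_feats = rng.normal(size=(n, d_img))     # stand-in image features
txt_feats = rng.normal(size=(n, d_txt))     # stand-in text features

# Placeholder projections (assumption): a real system would learn these
# from labeled, paired data rather than sample them at random.
U_img = rng.normal(size=(d_img, d_common))
U_txt = rng.normal(size=(d_txt, d_common))

# Once both modalities live in the same d_common-dimensional space,
# an image can be compared with a text directly.
img_common = img_feats @ U_img
txt_common = txt_feats @ U_txt

ranking = rank_by_cosine(img_common, txt_common)  # image-to-text retrieval
print(ranking.shape)  # (6, 6)
```

Swapping the two arguments of `rank_by_cosine` gives the text-to-image direction.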

Key words: Cross-modal Retrieval; High-Order Semantic Correlation; Subspace Mapping
Received: 05 August 2019      Published: 15 June 2020
ZTFLH:  TP393  
Corresponding Author: Tian Xiaomeng

Cite this article:

Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan. Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation. Data Analysis and Knowledge Discovery, 2020, 4(5): 84-91.


The Model of Cross-modal Retrieval
The Framework of Subspace Cross-modal Retrieval Based on High-order Semantic Correlation
Method    Image-to-text    Text-to-image    Average
CCA       0.2549           0.1846           0.2198
JGRHML    0.2830           0.2119           0.2475
SCM       0.3501           0.2496           0.2999
JFSSL     0.3063           0.2275           0.2669
OURS      0.4184           0.4039           0.4112
MAP of Different Methods on the Wiki Dataset
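For readers unfamiliar with the metric in these tables, the following is a minimal sketch of mean average precision (MAP). It assumes that a retrieved item counts as relevant when it shares the query's class label and that the full ranked list is scored; the evaluation protocol actually used in the paper (for example, any rank cutoff) is not stated on this page. The function names are illustrative only.

```python
import numpy as np

def average_precision(ranked_labels, query_label):
    """Average of precision@k taken at every rank k where a relevant item appears."""
    relevant = np.asarray(ranked_labels) == query_label
    if not relevant.any():
        return 0.0  # no relevant item in the gallery for this query
    hits = np.cumsum(relevant)                 # relevant items seen up to each rank
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = hits / ranks
    return float(precision_at_k[relevant].mean())

def mean_average_precision(rankings, gallery_labels, query_labels):
    """Mean of per-query average precision; rankings[i] orders gallery indices for query i."""
    gallery_labels = np.asarray(gallery_labels)
    aps = [
        average_precision(gallery_labels[np.asarray(order)], q)
        for order, q in zip(rankings, query_labels)
    ]
    return float(np.mean(aps))

# Toy example: 4 gallery items with labels 1, 0, 1, 0 and two queries of class 1.
gallery_labels = [1, 0, 1, 0]
rankings = [[0, 1, 2, 3],   # relevant items at ranks 1 and 3 -> AP = (1/1 + 2/3) / 2
            [1, 0, 3, 2]]   # relevant items at ranks 2 and 4 -> AP = (1/2 + 2/4) / 2
score = mean_average_precision(rankings, gallery_labels, query_labels=[1, 1])
```

In the toy example the two per-query values are 5/6 and 1/2, so `score` is 2/3.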
Method    Image-to-text    Text-to-image    Average
CCA       0.2178           0.1824           0.2001
JGRHML    0.3425           0.2866           0.3146
SCM       0.3746           0.2902           0.3324
JFSSL     0.4035           0.3747           0.3891
OURS      0.4975           0.4628           0.4801
MAP of Different Methods on the NUS-WIDE Dataset
Method    Image-to-text    Text-to-image    Average
CCA       0.1220           0.1207           0.1214
JGRHML    0.4601           0.3629           0.4115
SCM       0.6335           0.6210           0.6273
JFSSL     0.8126           0.7765           0.7946
OURS      0.9839           0.9752           0.9796
MAP of Different Methods on the XMedia Dataset
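The improvement margins quoted in the abstract can be reproduced from the average-MAP columns of the three tables. The short check below copies those averages verbatim; the helper name `margin_over_best_baseline` is introduced here for illustration only.

```python
# Average MAP per method, copied from the three tables above.
avg_map = {
    "Wiki":     {"CCA": 0.2198, "JGRHML": 0.2475, "SCM": 0.2999, "JFSSL": 0.2669, "OURS": 0.4112},
    "NUS-WIDE": {"CCA": 0.2001, "JGRHML": 0.3146, "SCM": 0.3324, "JFSSL": 0.3891, "OURS": 0.4801},
    "XMedia":   {"CCA": 0.1214, "JGRHML": 0.4115, "SCM": 0.6273, "JFSSL": 0.7946, "OURS": 0.9796},
}

def margin_over_best_baseline(scores, ours="OURS"):
    """Return the strongest baseline and how far the proposed method exceeds it."""
    baselines = {name: s for name, s in scores.items() if name != ours}
    best_name = max(baselines, key=baselines.get)
    return best_name, round(scores[ours] - baselines[best_name], 4)

margins = {dataset: margin_over_best_baseline(scores) for dataset, scores in avg_map.items()}
for dataset, (best_name, gain) in margins.items():
    print(f"{dataset}: best baseline {best_name}, margin {gain:.4f}")
```

The strongest baseline is SCM on Wiki and JFSSL on NUS-WIDE and XMedia, and the resulting margins of 0.1113, 0.0910 and 0.1850 agree with the figures reported in the abstract.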
Precision-Recall Curve on Wiki Dataset
Precision-Recall Curve on NUS-WIDE Dataset
Precision-Recall Curve on XMedia Dataset