Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (5): 84-91    DOI: 10.11925/infotech.2096-3467.2019.0912
Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation
Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan
School of Information Engineering, East China Jiaotong University, Nanchang 330000, China

[Objective] This paper maps heterogeneous multi-modal data into an isomorphic representation in order to bridge the semantic gap between modalities and improve the accuracy of cross-modal retrieval. [Methods] First, we determined the high-order semantic correlation among multi-modal data. Then, we combined the annotation information with the structural information of the multi-modal data. Finally, we projected the data of different modalities into an isomorphic subspace in which they can be retrieved directly. [Results] We evaluated the method on three public datasets: Wiki, NUS-WIDE and XMedia. Its average MAP was 0.1113, 0.0910 and 0.1850 higher, respectively, than the best result among CCA, JGRHML, SCM and JFSSL on each dataset. [Limitations] The method is not applicable to semi-supervised or unsupervised data. [Conclusions] The proposed method effectively improves the accuracy of cross-modal retrieval.
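
The retrieval step described in the abstract, in which data from different modalities are mapped into a shared space and compared directly, can be illustrated with a minimal sketch. The sketch is not the authors' method: it assumes a plain least-squares projection onto one-hot class labels as a stand-in for the learned subspace mapping, omits the high-order semantic correlation term entirely, and assumes cosine similarity for ranking. The toy feature matrices and all function names are hypothetical.

```python
import numpy as np


def fit_projection(features, labels_onehot):
    """Least-squares map from a feature space to the label space.

    Assumption: a stand-in for the learned subspace projection, not the
    objective function proposed in the paper.
    """
    projection, *_ = np.linalg.lstsq(features, labels_onehot, rcond=None)
    return projection


def cosine_similarity_matrix(a, b, eps=1e-12):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_unit = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return a_unit @ b_unit.T


def rank_gallery(query_projected, gallery_projected):
    """For each query, return gallery indices sorted by decreasing similarity."""
    similarity = cosine_similarity_matrix(query_projected, gallery_projected)
    return np.argsort(-similarity, axis=1)


# Toy paired data: four image/text pairs, two semantic classes (hypothetical).
labels = np.array([0, 0, 1, 1])
onehot = np.eye(2)[labels]
image_features = np.array([[1.0, 0.0, 0.0],
                           [1.0, 0.0, 1.0],
                           [0.0, 1.0, 0.0],
                           [0.0, 1.0, 1.0]])
text_features = np.array([[2.0, 0.0],
                          [3.0, 0.0],
                          [0.0, 1.0],
                          [0.0, 4.0]])

# One projection per modality; both land in the same 2-dimensional space.
image_projection = fit_projection(image_features, onehot)
text_projection = fit_projection(text_features, onehot)

# Image-to-text retrieval: images are queries, texts form the gallery.
ranking = rank_gallery(image_features @ image_projection,
                       text_features @ text_projection)
```

Each row of `ranking` lists text indices for one image query, most similar first. Swapping the two arguments of `rank_gallery` gives the text-to-image direction.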

Keywords: Cross-modal Retrieval; High-Order Semantic Correlation; Subspace Mapping
Received: 05 August 2019      Published: 15 June 2020
CLC Number: TP393
Corresponding Author: Tian Xiaomeng

Cite this article:

Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan. Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation. Data Analysis and Knowledge Discovery, 2020, 4(5): 84-91.


The Model of Cross-modal Retrieval
The Framework of Subspace Cross-modal Retrieval Based on High-order Semantic Correlation
Method    Image-to-Text    Text-to-Image    Average
CCA       0.2549           0.1846           0.2198
JGRHML    0.2830           0.2119           0.2475
SCM       0.3501           0.2496           0.2999
JFSSL     0.3063           0.2275           0.2669
OURS      0.4184           0.4039           0.4112
MAP of Different Methods on the Wiki Dataset
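
For readers unfamiliar with the metric, the following is a minimal sketch of how mean average precision (MAP) can be computed from a ranked retrieval result. It assumes that a retrieved item counts as relevant when it shares the query's class label and that the full ranked list is scored; the page does not state which cutoff or relevance rule the authors used. The function names and the toy ranking are hypothetical.

```python
import numpy as np


def average_precision(relevance):
    """Average precision for one query.

    relevance: sequence of 0/1 flags in ranked order (1 = relevant item).
    Returns 0.0 when the list contains no relevant item.
    """
    relevance = np.asarray(relevance, dtype=float)
    n_relevant = relevance.sum()
    if n_relevant == 0:
        return 0.0
    hits = np.cumsum(relevance)                 # relevant items seen so far
    ranks = np.arange(1, len(relevance) + 1)    # 1-based rank positions
    precision_at_k = hits / ranks
    # Average the precision values taken at the ranks of relevant items.
    return float((precision_at_k * relevance).sum() / n_relevant)


def mean_average_precision(ranking, query_labels, gallery_labels):
    """MAP over all queries; relevance is assumed to mean 'same class label'."""
    gallery_labels = np.asarray(gallery_labels)
    scores = []
    for order, label in zip(ranking, query_labels):
        relevance = (gallery_labels[np.asarray(order)] == label).astype(int)
        scores.append(average_precision(relevance))
    return float(np.mean(scores))


# Toy example: two queries, a gallery of three items (hypothetical values).
toy_ranking = [[0, 1, 2], [2, 0, 1]]
toy_query_labels = [0, 1]
toy_gallery_labels = [0, 1, 0]
toy_map = mean_average_precision(toy_ranking, toy_query_labels, toy_gallery_labels)
```

In the toy example the first query has relevant items at ranks 1 and 3 (average precision 5/6) and the second has its only relevant item at rank 3 (average precision 1/3), so `toy_map` is 7/12.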
Method    Image-to-Text    Text-to-Image    Average
CCA       0.2178           0.1824           0.2001
JGRHML    0.3425           0.2866           0.3146
SCM       0.3746           0.2902           0.3324
JFSSL     0.4035           0.3747           0.3891
OURS      0.4975           0.4628           0.4801
MAP of Different Methods on the NUS-WIDE Dataset
Method    Image-to-Text    Text-to-Image    Average
CCA       0.1220           0.1207           0.1214
JGRHML    0.4601           0.3629           0.4115
SCM       0.6335           0.6210           0.6273
JFSSL     0.8126           0.7765           0.7946
OURS      0.9839           0.9752           0.9796
MAP of Different Methods on the XMedia Dataset
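
The improvements quoted in the abstract can be checked against the three MAP tables. The short calculation below takes the average-MAP column of each table, finds the strongest baseline per dataset, and subtracts it from the proposed method's score. The values are copied from the tables; the variable names are hypothetical.

```python
# Average MAP per method, copied from the three tables above.
average_map = {
    "Wiki":     {"CCA": 0.2198, "JGRHML": 0.2475, "SCM": 0.2999,
                 "JFSSL": 0.2669, "OURS": 0.4112},
    "NUS-WIDE": {"CCA": 0.2001, "JGRHML": 0.3146, "SCM": 0.3324,
                 "JFSSL": 0.3891, "OURS": 0.4801},
    "XMedia":   {"CCA": 0.1214, "JGRHML": 0.4115, "SCM": 0.6273,
                 "JFSSL": 0.7946, "OURS": 0.9796},
}

gains = {}
best_baselines = {}
for dataset, scores in average_map.items():
    baselines = {name: value for name, value in scores.items() if name != "OURS"}
    best_name = max(baselines, key=baselines.get)   # strongest competing method
    best_baselines[dataset] = best_name
    gains[dataset] = round(scores["OURS"] - baselines[best_name], 4)

for dataset in average_map:
    print(f"{dataset}: +{gains[dataset]:.4f} over {best_baselines[dataset]}")
```

The differences come out to 0.1113 (over SCM on Wiki), 0.0910 (over JFSSL on NUS-WIDE) and 0.1850 (over JFSSL on XMedia), matching the figures reported in the abstract.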
Precision-Recall Curve on Wiki Dataset
Precision-Recall Curve on NUS-WIDE Dataset
Precision-Recall Curve on XMedia Dataset