Data Analysis and Knowledge Discovery  2020, Vol. 4 Issue (5): 84-91    DOI: 10.11925/infotech.2096-3467.2019.0912
Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation
Zhu Lu, Tian Xiaomeng, Cao Sainan, Liu Yuanyuan
School of Information Engineering, East China Jiaotong University, Nanchang 330000, China
Abstract  

[Objective] This paper maps heterogeneous multi-modal data into an isomorphic representation, aiming to bridge the semantic gap and improve the accuracy of cross-modal retrieval. [Methods] First, we determined the high-order semantic correlation between multi-modal data. Then, we combined the annotation information and the structure information of the multi-modal data. Finally, we transformed the data of different modalities into an isomorphic representation for direct retrieval. [Results] We evaluated our method on three open datasets: Wiki, NUS-WIDE and XMedia. The average MAP obtained by our method was 0.1113, 0.0910 and 0.1850 higher, respectively, than the best result among CCA, JGRHML, SCM and JFSSL. [Limitations] Our method is not applicable to semi-supervised or unsupervised data. [Conclusions] The proposed method effectively improves the accuracy of cross-modal retrieval.

Key words: Cross-modal Retrieval; High-Order Semantic Correlation; Subspace Mapping
Received: 05 August 2019      Published: 15 June 2020
ZTFLH:  TP393  
Corresponding Authors: Tian Xiaomeng     E-mail: tianxiaomeng2016@126.com

Cite this article:

Zhu Lu,Tian Xiaomeng,Cao Sainan,Liu Yuanyuan. Subspace Cross-modal Retrieval Based on High-Order Semantic Correlation. Data Analysis and Knowledge Discovery, 2020, 4(5): 84-91.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2019.0912     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2020/V4/I5/84

The Model of Cross-modal Retrieval
The Framework of Subspace Cross-modal Retrieval Based on High-order Semantic Correlation
Method    Image-to-Text    Text-to-Image    Average
CCA       0.2549           0.1846           0.2198
JGRHML    0.2830           0.2119           0.2475
SCM       0.3501           0.2496           0.2999
JFSSL     0.3063           0.2275           0.2669
OURS      0.4184           0.4039           0.4112
MAP in Different Methods on Wiki Dataset
Method    Image-to-Text    Text-to-Image    Average
CCA       0.2178           0.1824           0.2001
JGRHML    0.3425           0.2866           0.3146
SCM       0.3746           0.2902           0.3324
JFSSL     0.4035           0.3747           0.3891
OURS      0.4975           0.4628           0.4801
MAP in Different Methods on NUS-WIDE Dataset
Method    Image-to-Text    Text-to-Image    Average
CCA       0.1220           0.1207           0.1214
JGRHML    0.4601           0.3629           0.4115
SCM       0.6335           0.6210           0.6273
JFSSL     0.8126           0.7765           0.7946
OURS      0.9839           0.9752           0.9796
MAP in Different Methods on XMedia Dataset
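
The gains quoted in the abstract can be checked against the Average column of the three MAP tables. The short sketch below copies those averages and, for each dataset, subtracts the strongest baseline from OURS. The dictionary layout and the helper name `gain_over_best_baseline` are illustrative choices, not part of the paper.

```python
# Average MAP per method, copied from the three tables above.
avg_map = {
    "Wiki":     {"CCA": 0.2198, "JGRHML": 0.2475, "SCM": 0.2999, "JFSSL": 0.2669, "OURS": 0.4112},
    "NUS-WIDE": {"CCA": 0.2001, "JGRHML": 0.3146, "SCM": 0.3324, "JFSSL": 0.3891, "OURS": 0.4801},
    "XMedia":   {"CCA": 0.1214, "JGRHML": 0.4115, "SCM": 0.6273, "JFSSL": 0.7946, "OURS": 0.9796},
}


def gain_over_best_baseline(scores, ours="OURS"):
    """Return (best baseline name, OURS minus that baseline), rounded to 4 decimals."""
    baselines = {name: value for name, value in scores.items() if name != ours}
    best_name = max(baselines, key=baselines.get)
    return best_name, round(scores[ours] - baselines[best_name], 4)


gains = {dataset: gain_over_best_baseline(scores) for dataset, scores in avg_map.items()}
for dataset, (baseline, gain) in gains.items():
    print(f"{dataset}: +{gain:.4f} over {baseline}")
```

The strongest baseline is SCM on Wiki and JFSSL on NUS-WIDE and XMedia, and the differences match the 0.1113, 0.0910 and 0.1850 reported in the abstract.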
Precision-Recall Curve on Wiki Dataset
Precision-Recall Curve on NUS-WIDE Dataset
Precision-Recall Curve on XMedia Dataset
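
For readers unfamiliar with the metrics behind these tables and curves, the following is a minimal sketch of how MAP and precision-recall points are commonly computed for cross-modal retrieval. It assumes that queries and gallery items have already been projected into a common subspace, that ranking uses cosine similarity, and that a result counts as relevant when it shares the query's class label. The toy vectors, labels and function names are hypothetical, and the paper's exact evaluation protocol (for example, whether MAP is taken over all results or only the top R) is not stated on this page.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_gallery(query, gallery):
    """Gallery indices sorted by decreasing similarity to the query."""
    sims = [cosine(query, item) for item in gallery]
    return sorted(range(len(gallery)), key=lambda i: sims[i], reverse=True)


def average_precision(ranked_labels, query_label):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def precision_recall_points(ranked_labels, query_label):
    """(recall, precision) after each rank, for plotting a PR curve."""
    n_relevant = sum(label == query_label for label in ranked_labels)
    points, hits = [], 0
    for rank, label in enumerate(ranked_labels, start=1):
        hits += label == query_label
        if n_relevant:
            points.append((hits / n_relevant, hits / rank))
    return points


def mean_average_precision(queries, query_labels, gallery, gallery_labels):
    """MAP over all queries, e.g. image queries against a text gallery."""
    aps = []
    for query, query_label in zip(queries, query_labels):
        order = rank_gallery(query, gallery)
        aps.append(average_precision([gallery_labels[i] for i in order], query_label))
    return sum(aps) / len(aps)


# Hypothetical toy data: two projected image queries, four projected texts.
image_queries = [[1.0, 0.1], [0.1, 1.0]]
image_labels = [0, 1]
text_gallery = [[0.9, 0.2], [0.2, 0.9], [0.8, 0.3], [0.5, 0.5]]
text_labels = [0, 1, 1, 0]

map_i2t = mean_average_precision(image_queries, image_labels, text_gallery, text_labels)
first_order = rank_gallery(image_queries[0], text_gallery)
pr_points = precision_recall_points([text_labels[i] for i in first_order], image_labels[0])
print(f"image-to-text MAP on toy data: {map_i2t:.4f}")
```

Swapping the roles of the two modalities (text queries against an image gallery) gives the text-to-image score, and the Average column in the tables is the mean of the two directions.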