Cross-media Fusion Method Based on LDA2Vec and Residual Network
Qinghong Zhong1,2, Xiaodong Qiao1, Yunliang Zhang1,2, Mengjuan Weng1,2
1 Institute of Scientific and Technical Information of China, Beijing 100038, China
2 Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content, Beijing 100038, China

Abstract [Objective] This paper optimizes feature extraction within a cross-media fusion framework to reduce the semantic gap between heterogeneous data. [Methods] We extracted text features with LDA2Vec and image features with ResNet V2, then used a semantic association matching technique to map the heterogeneous text and image features into a common representation space. [Results] Compared with LDA and SIFT features, the proposed method increased the MAP of text/image mutual retrieval to 0.454. [Limitations] The training set needs to be expanded, and optimizing feature extraction has only a limited impact on cross-media fusion. [Conclusions] The proposed method is effective and offers a new direction for cross-media research.
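
The MAP (mean average precision) figure reported above can be illustrated with a minimal sketch. It assumes that text and image features have already been projected into the common space, that retrieval ranks the gallery by cosine similarity, and that an item is relevant when it shares the query's category label. These choices, and all function names below, are illustrative assumptions rather than the authors' evaluation code.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def average_precision(ranked_labels: Sequence[int], query_label: int) -> float:
    """Average precision of one ranked list; relevant = same label as the query."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    queries: List[Sequence[float]],
    query_labels: Sequence[int],
    gallery: List[Sequence[float]],
    gallery_labels: Sequence[int],
) -> float:
    """MAP of retrieving gallery items (e.g. images) for queries (e.g. texts)."""
    ap_values = []
    for q, q_label in zip(queries, query_labels):
        # Rank gallery items by decreasing similarity to the query.
        order = sorted(
            range(len(gallery)),
            key=lambda i: cosine(q, gallery[i]),
            reverse=True,
        )
        ranked = [gallery_labels[i] for i in order]
        ap_values.append(average_precision(ranked, q_label))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0
```

Under these assumptions, text-to-image retrieval passes text features as `queries` and image features as `gallery`; swapping the two gives image-to-text retrieval.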
Received: 14 January 2019
Published: 25 November 2019
Corresponding Author: Yunliang Zhang
E-mail: zhangyl@istic.ac.cn