|
|
A Multi-Label Classification Model with Two-Stage Transfer Learning |
Lu Quan 1,2, He Chao 1, Chen Jing 3(✉), Tian Min 1, Liu Ting 1 |
1 Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
2 Big Data Research Institute, Wuhan University, Wuhan 430072, China
3 School of Information Management, Central China Normal University, Wuhan 430079, China |
|
|
Abstract [Objective] This paper proposes a multi-label classification model that aims to improve on the data sampling and common-feature learning of existing models. [Methods] We constructed a two-stage transfer learning model along the chain "general domain → single-label data in the target domain → multi-label data". The model was first trained on the general domain, then fine-tuned on target-domain single-label data that had been balanced by over-sampling, and finally transferred to the multi-label data to generate multi-label classifications. [Results] We evaluated the model on image annotations from medical literature. On multi-label classification tasks for images and texts, the F1 score improved by more than 50% compared with the one-stage transfer learning model. [Limitations] More research is needed on choosing better base models and sampling methods for different tasks. [Conclusions] The proposed method could be used for the annotation, retrieval, and utilization of big data sets under constrained conditions.
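The pipeline described in the abstract can be pictured as a short sketch. The snippet below is a minimal PyTorch illustration of the two-stage chain for the image task, assuming a ResNet-50 backbone pre-trained on the general domain (ImageNet, cf. [29] and [38]), fine-tuned on over-sampled single-label target data, and then carried over to the multi-label data with a sigmoid head. It is not the authors' code: NUM_CLASSES, balanced_sampler, and the elided training loops are hypothetical placeholders.

import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler
from torchvision import models

NUM_CLASSES = 30  # hypothetical number of target-domain labels

# Stage 0: backbone pre-trained on the general domain
# (ImageNet weights, torchvision >= 0.13 API).
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()  # keep only the 2048-d feature extractor

def balanced_sampler(labels):
    # Over-sample minority classes: draw each example with probability
    # inversely proportional to its class frequency.
    labels = torch.as_tensor(labels)
    counts = torch.bincount(labels, minlength=NUM_CLASSES).clamp(min=1).float()
    weights = 1.0 / counts[labels]
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)

# Stage 1: fine-tune on the over-sampled single-label target-domain data
# with a softmax head and cross-entropy loss.
single_label_model = nn.Sequential(backbone, nn.Linear(2048, NUM_CLASSES))
single_label_loss = nn.CrossEntropyLoss()
# ... train with DataLoader(dataset, sampler=balanced_sampler(dataset_labels)) ...

# Stage 2: carry the fine-tuned backbone over to the multi-label data;
# the head becomes independent per-label sigmoid outputs.
multi_label_model = nn.Sequential(backbone, nn.Linear(2048, NUM_CLASSES))
multi_label_loss = nn.BCEWithLogitsLoss()
# ... continue training, then predict labels whose sigmoid score exceeds 0.5 ...

Only the classification head and the loss change between the two stages; the backbone fine-tuned in stage 1 is reused in stage 2, which is where the common features are transferred to the multi-label task.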
|
Received: 27 November 2020
Published: 08 March 2021
|
|
Fund: National Natural Science Foundation of China (71921002); Construction Project of College of State Secrets of Wuhan University in 2020 |
Corresponding Author:
Chen Jing, ORCID: 0000-0002-6444-2962
E-mail: dancinglulu@sina.com
|
[1] |
Weiss G M, Provost F. Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction[J]. Journal of Artificial Intelligence Research, 2003, 19(1):315-354.
doi: 10.1613/jair.1199
|
[2] |
Wang Y Q, Yao Q M, Kwok J T, et al. Generalizing from a Few Examples: A Survey on Few-Shot Learning[J]. ACM Computing Surveys (CSUR), 2020, 53(3):1-34.
|
[3] |
Yang Q, Wu X D. 10 Challenging Problems in Data Mining Research[J]. International Journal of Information Technology & Decision Making, 2006, 5(4):597-604.
|
[4] |
Tsoumakas G, Katakis I. Multi-Label Classification: An Overview[J]. International Journal of Data Warehousing and Mining, 2007, 3(3):1-13.
|
[5] |
Salakhutdinov R, Tenenbaum J, Torralba A. One-Shot Learning with a Hierarchical Nonparametric Bayesian Model[C]// Proceedings of ICML Workshop on Unsupervised and Transfer Learning. 2012:195-206.
|
[6] |
Koch G, Zemel R, Salakhutdinov R. Siamese Neural Networks for One-Shot Image Recognition[C]// Proceedings of the 32nd International Conference on Machine Learning. 2015.
|
[7] |
Vinyals O, Blundell C, Lillicrap T, et al. Matching Networks for One Shot Learning[C]// Proceedings of the 30th Conference on Neural Information Processing Systems. 2016: 3630-3638.
|
[8] |
Snell J, Swersky K, Zemel R. Prototypical Networks for Few-Shot Learning[C]// Proceedings of the 31st Conference on Neural Information Processing Systems. 2017: 4077-4087.
|
[9] |
Pan S J, Yang Q. A Survey on Transfer Learning[J]. IEEE Transactions on Knowledge & Data Engineering, 2010, 22(10):1345-1359.
|
[10] |
Zhuang Fuzhen, Luo Ping, He Qing, et al. Survey on Transfer Learning Research[J]. Journal of Software, 2015, 26(1):26-39. (in Chinese)
|
[11] |
Ng W W, Hu J, Yeung D S, et al. Diversified Sensitivity-Based Undersampling for Imbalance Classification Problems[J]. IEEE Transactions on Cybernetics, 2015, 45(11):2402-2412.
doi: 10.1109/TCYB.2014.2372060
|
[12] |
Zhao Qinghua, Zhang Yihao, Ma Jianfen, et al. Research on Classification Algorithm of Imbalanced Datasets Based on Improved SMOTE[J]. Computer Engineering and Applications, 2018, 54(18):168-173. (in Chinese)
|
[13] |
Cervantes J, Huang D S, Farid G L, et al. A Hybrid Algorithm to Improve the Accuracy of Support Vector Machines on Skewed Data Sets[M]. Berlin, Germany: Springer, 2014: 782-788.
|
[14] |
Cui Wei, Jia Xiaolin, Fan Shuaishuai, et al. New Associative Classification Algorithm for Imbalanced Data[J]. Computer Science, 2020, 47(S1):488-493. (in Chinese)
|
[15] |
Long Mingsheng. Transfer Learning Problems and Methods[D]. Beijing: Tsinghua University, 2014. (in Chinese)
|
[16] |
Tan C Q, Sun F C, Kong T, et al. A Survey on Deep Transfer Learning[C]// Proceedings of International Conference on Artificial Neural Networks. 2018: 270-279.
|
[17] |
Yosinski J, Clune J, Bengio Y, et al. How Transferable are Features in Deep Neural Networks?[C]// Proceedings of the 27th Conference on Neural Information Processing Systems. 2014: 3320-3328.
|
[18] |
Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies-Volume 1. 2019: 4171-4186.
|
[19] |
Li Nianhua. Research on Key Problems of Transfer Learning in Deep Neural Networks[D]. Chengdu: University of Electronic Science and Technology of China, 2018. (in Chinese)
|
[20] |
Zhu Huaitao. Multi-label Classification Method and Application Research in Small Samples[D]. Chengdu: University of Electronic Science and Technology of China, 2020. (in Chinese)
|
[21] |
Cheng Lei, Wu Xiaofu, Zhang Suofei. Analysis of the Effect of Class Imbalance on Transfer Learning[J]. Journal of Signal Processing, 2020, 36(1):110-117. (in Chinese)
|
[22] |
Estabrooks A, Jo T, Japkowicz N. A Multiple Resampling Method for Learning from Imbalanced Data Sets[J]. Computational Intelligence, 2004, 20(1):18-36.
doi: 10.1111/coin.2004.20.issue-1
|
[23] |
Bunkhumpornpat C, Sinapiromsaran K, Lursinsap C. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling Technique for Handling the Class Imbalanced Problem[C]// Proceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining. 2009: 475-482.
|
[24] |
Dong A M, Chung F L, Wang S T. Semi-Supervised Classification Method Through Oversampling and Common Hidden Space[J]. Information Sciences, 2016, 349:216-228.
|
[25] |
Konno T, Iwazume M. Pseudo-Feature Generation for Imbalanced Data Analysis in Deep Learning[OL]. arXiv Preprint, arXiv:1807.06538.
|
[26] |
Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative Adversarial Nets[C]// Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014:2672-2680.
|
[27] |
Yu Yuhai, Lin Hongfei, Meng Jia'na, et al. Classification Modeling and Recognition for Cross Modal and Multi-Label Biomedical Image[J]. Journal of Image and Graphics, 2018, 23(6):917-927. (in Chinese)
|
[28] |
Li Sihao, Chen Fucai, Huang Ruiyang. Multi-label Random Balanced Resampling Algorithm[J]. Application Research of Computers, 2017, 34(10):2929-2932. (in Chinese)
|
[29] |
Russakovsky O, Deng J, Su H, et al. ImageNet Large Scale Visual Recognition Challenge[J]. International Journal of Computer Vision, 2015, 115(3):211-252.
doi: 10.1007/s11263-015-0816-y
|
[30] |
Mikolov T, Grave E, Bojanowski P, et al. Advances in Pre-Training Distributed Word Representations[OL]. arXiv Preprint, arXiv:1712.09405.
|
[31] |
ImageCLEFmed (2013, 2015, 2016) Dataset[DS/OL]. [2019-06-01]. https://www.imageclef.org/.
|
[32] |
Chen Jianmei, Song Shunlin, Zhu Yuquan, et al. A Method of Medical Images Combining Classification Based on Bayesian and Neural Network[J]. Computer Science, 2008, 35(3):244-246. (in Chinese)
|
[33] |
Sun Junding, Li Lin. Classification of Medical Image Based on BP Neural Network[J]. Computer Systems & Applications, 2012, 21(3):160-162. (in Chinese)
|
[34] |
LeCun Y, Bengio Y, Hinton G. Deep Learning[J]. Nature, 2015, 521(7553):436-444.
doi: 10.1038/nature14539
|
[35] |
Ratliff L J, Burden S A, Sastry S S. Characterization and Computation of Local Nash Equilibria in Continuous Games[C]// Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton). 2013: 917-924.
|
[36] |
He K M, Zhang X Y, Ren S Q, et al. Deep Residual Learning for Image Recognition[C]// Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778.
|
[37] |
BERT Model Weights[EB/OL]. [2019-06-01]. https://storage.googleapis.com/bert_models/2020_02_20/uncased_L-12_H-768_A-12.zip.
|
[38] |
ResNet Model Weights [EB/OL]. [2019-06-01]. https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5.
|
[39] |
Tian Min. Image Multi-Label Classification in Medical Literature Based on Transfer Learning[D]. Wuhan: Wuhan University, 2019. (in Chinese)
|
[40] |
Tahir M A, Kittler J, Bouridane A. Multilabel Classification Using Heterogeneous Ensemble of Multi-Label Classifiers[J]. Pattern Recognition Letters, 2012, 33(5):513-523.
doi: 10.1016/j.patrec.2011.10.019
|
[41] |
Sun C, Qiu X P, Xu Y G, et al. How to Fine-Tune BERT for Text Classification?[C]// Proceedings of China National Conference on Chinese Computational Linguistics. 2019:194-206.
|
[42] |
Read J, Pfahringer B, Holmes G, et al. Classifier Chains for Multi-Label Classification[J]. Machine Learning, 2011, 85(3):333-359.
doi: 10.1007/s10994-011-5256-5
|