Data Analysis and Knowledge Discovery  2021, Vol. 5 Issue (7): 91-100    DOI: 10.11925/infotech.2096-3467.2020.1173
A Multi-Label Classification Model with Two-Stage Transfer Learning
Lu Quan1,2, He Chao1, Chen Jing3, Tian Min1, Liu Ting1
1Center for Studies of Information Resources, Wuhan University, Wuhan 430072, China
2Big Data Research Institute, Wuhan University, Wuhan 430072, China
3School of Information Management, Central China Normal University, Wuhan 430079, China
Abstract  

[Objective] This paper proposes a multi-label classification model that improves on existing models in two respects: data sampling and the addition of common features. [Methods] We constructed a two-stage transfer learning model that moves from the general domain to single-label data in the target domain and then to multi-label data. We trained the model in the general and target domains and fine-tuned it on single-label data balanced with an over-sampling method. Finally, we transferred the model to the multi-label data to perform multi-label classification. [Results] We evaluated the model on image annotation in medical literature. On the multi-label classification tasks for images and for texts, the F1 score improved by more than 50% over the one-stage transfer learning model. [Limitations] More research is needed to choose better base models and sampling methods for different tasks. [Conclusions] The proposed method could be used for the annotation, retrieval, and utilization of big data sets with constraints.
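
One practical difference between the two fine-tuning stages described in the abstract is the output layer: single-label data call for exactly one class per sample, whereas multi-label data allow several. The minimal sketch below illustrates that difference in plain Python. It is an illustration under assumptions, not the authors' implementation: the function names, the 0.5 threshold, and the toy scores are hypothetical, and the abstract does not state which activation or threshold the paper uses.

```python
import math

def softmax(scores):
    """Turn raw scores into one probability distribution (single-label head)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(score):
    """Turn one raw score into an independent probability (multi-label head)."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_single_label(scores):
    """Stage-one style decoding: pick the single most probable class."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)

def predict_multi_label(scores, threshold=0.5):
    """Stage-two style decoding: keep every class whose probability
    reaches the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if sigmoid(s) >= threshold]

# Hypothetical raw scores for four classes.
toy_scores = [2.0, -1.0, 0.5, -3.0]
single = predict_single_label(toy_scores)  # 0
multi = predict_multi_label(toy_scores)    # [0, 2]
```

With the same scores, the single-label head returns only class 0, while the multi-label head returns classes 0 and 2, because each class is thresholded independently.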

Keywords: Multi-Label Classification; Transfer Learning; Data Equalization; BERT Model; ResNet Model
Received: 27 November 2020      Published: 08 March 2021
ZTFLH:  G203  
Fund: National Natural Science Foundation of China (71921002); Construction Project of College of State Secrets of Wuhan University in 2020
Corresponding Author: Chen Jing, ORCID: 0000-0002-6444-2962, E-mail: dancinglulu@sina.com

Cite this article:

Lu Quan, He Chao, Chen Jing, Tian Min, Liu Ting. A Multi-Label Classification Model with Two-Stage Transfer Learning. Data Analysis and Knowledge Discovery, 2021, 5(7): 91-100.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2020.1173     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2021/V5/I7/91

Multi-Label Classification Model Based on Two-Stage Transfer Learning
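| No. | Code | Category | Images before equalization | Images after equalization | Texts before equalization | Texts after equalization |
|---|---|---|---|---|---|---|
| 1 | D3DR | 3D reconstruction | 369 | 200 | 67 | 301 |
| 2 | DMEL | Electron microscopy | 367 | 200 | 66 | 301 |
| 3 | DMFL | Fluorescence microscopy | 1,256 | 200 | 201 | 299 |
| 4 | DMLI | Light microscopy | 1,313 | 200 | 55 | 301 |
| 5 | DMTR | Transmission microscopy | 462 | 200 | 66 | 301 |
| 6 | DRAN | Angiography | 165 | 200 | 19 | 301 |
| 7 | DRCO | Combined modalities in one image | 73 | 200 | 281 | 174 |
| 8 | DRCT | Computerized tomography | 431 | 200 | 174 | 301 |
| 9 | DRMR | Magnetic resonance imaging | 470 | 200 | 19 | 301 |
| 10 | DRPE | Positron emission tomography | 48 | 200 | 136 | 138 |
| 11 | DRUS | Ultrasound | 300 | 200 | 391 | 301 |
| 12 | DRXR | X-ray radiography | 484 | 200 | 391 | 390 |
| 13 | DSEC | Electrocardiography | 143 | 200 | 117 | 301 |
| 14 | DSEE | Electroencephalography | 41 | 200 | 29 | 301 |
| 15 | DSEM | Electromyography | 30 | 200 | 18 | 301 |
| 16 | DVDM | Dermatology images | 145 | 200 | 104 | 301 |
| 17 | DVEN | Endoscopy | 108 | 200 | 75 | 301 |
| 18 | DVOR | Images of other organs | 238 | 200 | 146 | 301 |
| 19 | GCHE | Chemical structure | 156 | 200 | 69 | 255 |
| 20 | GFIG | Statistical figures and charts | 5,243 | 200 | 186 | 301 |
| 21 | GFLO | Flowcharts | 165 | 200 | 106 | 301 |
| 22 | GGEL | Gel chromatography | 653 | 200 | 80 | 301 |
| 23 | GGEN | Gene sequence | 418 | 200 | 80 | 301 |
| 24 | GHDR | Hand-drawn sketches | 285 | 200 | 93 | 301 |
| 25 | GMAT | Mathematical formulae | 43 | 200 | 24 | 218 |
| 26 | GNCP | Non-clinical photos | 241 | 200 | 116 | 301 |
| 27 | GPLI | Program listings | 53 | 200 | 48 | 301 |
| 28 | GSCR | Screenshots | 150 | 200 | 105 | 301 |
| 29 | GSYS | System overviews | 271 | 200 | 93 | 301 |
| 30 | GTAB | Tables | 186 | 200 | 87 | 301 |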
Quantity Distribution of Single-Label Image and Text Data in 30 Categories before and after Equalization
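
In the image columns of the table above, every category ends at 200 samples: large categories shrink and small ones grow. One simple way to reach a fixed per-class target is random resampling, sketched below. This is a hedged illustration only: the excerpt does not specify the equalization procedure beyond "over-sampling", so the function name `equalize`, the duplication-based strategy, and the seed are assumptions, and the text columns (which do not all reach one target) are not modeled.

```python
import random

def equalize(samples_by_class, target, seed=0):
    """Resample each class to exactly `target` items.

    Assumed strategy (not confirmed by the source):
    - classes with at least `target` items are randomly under-sampled
      without replacement;
    - smaller classes keep all originals and are topped up with
      randomly chosen duplicates (random over-sampling).
    """
    rng = random.Random(seed)
    balanced = {}
    for label, samples in samples_by_class.items():
        if not samples:
            raise ValueError(f"class {label!r} has no samples to draw from")
        if len(samples) >= target:
            balanced[label] = rng.sample(samples, target)
        else:
            extra = rng.choices(samples, k=target - len(samples))
            balanced[label] = list(samples) + extra
    return balanced

# Image counts before equalization for three categories from the table;
# the file-name strings are placeholders for real images.
raw = {
    "GFIG": [f"GFIG_{i}" for i in range(5243)],
    "DRAN": [f"DRAN_{i}" for i in range(165)],
    "DSEM": [f"DSEM_{i}" for i in range(30)],
}
balanced = equalize(raw, target=200)
sizes = {label: len(items) for label, items in balanced.items()}
```

Plain duplication is the simplest choice; synthetic methods such as SMOTE generate new samples instead of repeating existing ones, at the cost of extra assumptions about the feature space.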
Data Equalization Method
Multi-Label Classification Model for Medical Literature Images
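| Classification method | Experimental model | Description |
|---|---|---|
| No transfer | Text BERT; Image ResNet | No transfer is used; the multi-label data are classified directly. |
| One-stage transfer | Text BERT; Image ResNet | One-stage transfer learning: pre-trained model weights are loaded, training on single-label data is skipped, and the multi-label data are classified directly. |
| Two-stage text transfer | Text CNN (unbalanced); Text CNN (balanced); Text BERT (unbalanced); Text BERT (balanced) | Two-stage transfer learning: pre-trained CNN or BERT weights are loaded, and the model is trained on the single-label and multi-label text data. Models are designated unbalanced or balanced according to whether the single-label text data were equalized. |
| Two-stage image transfer | Image ResNet (unbalanced); Image ResNet (balanced) | Two-stage transfer learning: pre-trained ResNet weights are loaded, and the model is trained on the single-label and multi-label image data. Models are designated unbalanced or balanced according to whether the single-label image data were equalized. |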
Setup of Controlled Experiment
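| Classification method | Model | HLoss | F1-macro | AUC-micro | AUC-macro |
|---|---|---|---|---|---|
| No transfer | Text BERT | 0.0328 | 0.0415 | 0.9205 | 0.7267 |
| No transfer | Image ResNet | 0.0603 | 0.0284 | 0.6282 | 0.6051 |
| One-stage transfer | Text BERT | 0.0251 | 0.2917 | 0.9441 | 0.8411 |
| One-stage transfer | Image ResNet | 0.0242 | 0.2370 | 0.7516 | 0.7220 |
| Two-stage text transfer | Text CNN (unbalanced) | 0.0239 | 0.1850 | 0.8552 | 0.8307 |
| Two-stage text transfer | Text CNN (balanced) | 0.0230 | 0.1920 | 0.8734 | 0.8601 |
| Two-stage text transfer | Text BERT (unbalanced) | 0.0185 | 0.4501 | 0.9609 | 0.8733 |
| Two-stage text transfer | Text BERT (balanced) | 0.0179 | 0.4623 | 0.9757 | 0.8997 |
| Two-stage image transfer | Image ResNet (unbalanced) | 0.0160 | 0.4820 | 0.7644 | 0.7516 |
| Two-stage image transfer | Image ResNet (balanced) | 0.0158 | 0.4891 | 0.7753 | 0.7820 |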
Results of Multi-Label Classification Models
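
The HLoss and macro F1 columns can be read against their standard definitions, sketched below in plain Python together with the relative F1 gain of the balanced two-stage models over the one-stage models. The excerpt does not give the paper's exact metric formulas, so this assumes the usual ones: Hamming loss as the fraction of wrongly predicted label slots, and macro F1 as the unweighted mean of per-label F1, with a label scoring 0 when it has no true or predicted positives (an assumed convention). The toy label matrices are hypothetical; the four F1 values are taken from the table.

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) slots predicted incorrectly."""
    n_slots = len(y_true) * len(y_true[0])
    wrong = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return wrong / n_slots

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumed convention: F1 is 0 when the label never occurs or is predicted.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels

def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline

# Hypothetical toy data: 3 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
toy_hloss = hamming_loss(y_true, y_pred)  # 2 wrong slots out of 9
toy_f1 = macro_f1(y_true, y_pred)         # mean of 1, 2/3, and 0

# Macro F1 from the table: one-stage vs. balanced two-stage.
bert_gain = relative_gain(0.2917, 0.4623)
resnet_gain = relative_gain(0.2370, 0.4891)
```

On the table's figures, the relative macro F1 gain is roughly 58% for text BERT and 106% for image ResNet, which is consistent with the abstract's statement of an improvement of more than 50%.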