Data Analysis and Knowledge Discovery  2024, Vol. 8 Issue (2): 44-55     https://doi.org/10.11925/infotech.2096-3467.2022.1278
Research Paper
Label Distribution Learning Based on Hierarchical Tag Structure*
Liu Kan1, You Meilin2, Wei Lanxi1
1School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
2China Mobile Communications Group Sichuan Co. Ltd., Chengdu 610041, China
Abstract

[Objective] Labels in label distribution learning often stand in hierarchical relationships to one another. This paper introduces a hierarchical label structure into label distribution learning to improve its performance. [Methods] We propose a hierarchy-based label distribution learning algorithm (Hierarchy Label Distribution Learning Algorithm, H-LDL). Using each sample's labels at every level, the algorithm describes the structural relationship between the coarse and fine levels with conditional probability, and adjusts the label distributions across levels through a hierarchically weighted loss function and its optimization strategy. [Results] We ran experiments on two public datasets and evaluated them with five metrics. On the BU_3DFE dataset, the Euclidean, Squared, and K-L scores fell by 3.99, 1.07, and 3.10 percentage points relative to the lowest baseline values, while Intersec and Fidelity rose by 4.24 and 0.67 percentage points relative to the highest baseline values. On the COPM dataset, Euclidean fell by 0.48 percentage points, Squared and K-L showed no clear decrease, and Intersec and Fidelity rose by 0.45 and 0.02 percentage points. [Limitations] We considered only a two-level structure with a coarse and a fine level. Labels with more complex hierarchical relationships need further research. [Conclusions] Adding a hierarchical label structure clearly reduces label distribution error and effectively improves the performance of label distribution learning.

Keywords: Hierarchical Structure; Label Distribution Learning; Hierarchical Tag; Conditional Probability
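The abstract states that conditional probability links the coarse and fine label levels but gives no formula. The following is a minimal sketch of one way such a link can work, not the authors' implementation. It assumes that each fine label belongs to exactly one coarse label, so that the coarse degree is the sum of its fine degrees and each fine degree equals P(coarse) times P(fine | coarse). The grouping `COARSE_TO_FINE`, the label names, and all function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical grouping (not from the paper): coarse label -> fine-label indices.
COARSE_TO_FINE: Dict[str, List[int]] = {
    "positive": [0, 1],
    "neutral": [2, 3],
    "negative": [4, 5],
}


def coarse_from_fine(fine: List[float], groups: Dict[str, List[int]]) -> Dict[str, float]:
    """Marginalise: P(coarse) is the sum of the fine degrees inside its group."""
    return {c: sum(fine[i] for i in idx) for c, idx in groups.items()}


def conditional_fine_given_coarse(fine: List[float], groups: Dict[str, List[int]]) -> List[float]:
    """P(fine | coarse) = P(fine) / P(coarse) for each fine label in the group."""
    coarse = coarse_from_fine(fine, groups)
    cond = [0.0] * len(fine)
    for c, idx in groups.items():
        for i in idx:
            cond[i] = fine[i] / coarse[c] if coarse[c] > 0 else 0.0
    return cond


def fine_from_coarse(coarse: Dict[str, float], cond: List[float],
                     groups: Dict[str, List[int]]) -> List[float]:
    """Chain rule: P(fine) = P(coarse) * P(fine | coarse)."""
    fine = [0.0] * len(cond)
    for c, idx in groups.items():
        for i in idx:
            fine[i] = coarse[c] * cond[i]
    return fine


# Made-up fine-level label distribution over six labels (sums to 1).
fine = [0.30, 0.20, 0.10, 0.15, 0.05, 0.20]
coarse = coarse_from_fine(fine, COARSE_TO_FINE)
cond = conditional_fine_given_coarse(fine, COARSE_TO_FINE)
rebuilt = fine_from_coarse(coarse, cond, COARSE_TO_FINE)
print(coarse)
print(rebuilt)
```

Under these assumptions the coarse distribution still sums to 1, and multiplying it by the conditional probabilities recovers the fine distribution, which is the consistency property a two-level model would need to preserve.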
Received: 2022-12-02      Published: 2023-03-30
Chinese Library Classification (ZTFLH): TP391; G35
Funding: *National Natural Science Foundation of China (72174156)
Corresponding author: Liu Kan, ORCID: 0000-0001-9339-7315, E-mail: liukan@zuel.edu.cn
Cite this article:
Liu Kan, You Meilin, Wei Lanxi. Label Distribution Learning Based on Hierarchical Tag Structure[J]. Data Analysis and Knowledge Discovery, 2024, 8(2): 44-55.
Link to this article:
https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/10.11925/infotech.2096-3467.2022.1278      or      https://manu44.magtech.com.cn/Jwk_infotech_wk3/CN/Y2024/V8/I2/44
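Fig.1  Examples of single-label, multi-label, and label distribution annotation
Fig.2  Workflow of label distribution learning based on hierarchical structure
Fig.3  Network architecture of the hierarchy-based label distribution learning algorithm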
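| Metric | Formula |
| --- | --- |
| Euclidean↓ | $d(D, D^*) = \sqrt{\sum_{i=1}^{c} (d_i - d_i^*)^2}$ |
| Squared↓ | $d(D, D^*) = \sum_{i=1}^{c} (d_i - d_i^*)^2$ |
| K-L↓ | $d(D, D^*) = \sum_{i=1}^{c} d_i \ln \frac{d_i}{d_i^*}$ |
| Intersec↑ | $d(D, D^*) = \sum_{i=1}^{c} \min(d_i, d_i^*)$ |
| Fidelity↑ | $d(D, D^*) = \sum_{i=1}^{c} \sqrt{d_i d_i^*}$ |
Table 1  Evaluation metrics for label distribution learning algorithms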
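The five metrics in Table 1 can be sketched in plain Python as follows. This is an illustrative sketch, assuming that Euclidean and Fidelity carry a square root (the usual definition of both measures) and that `d` and `d_star` are two label distributions over the same `c` labels. The function names and the `eps` guard against division by zero are additions for illustration, not part of the paper.

```python
import math
from typing import Sequence


def euclidean(d: Sequence[float], d_star: Sequence[float]) -> float:
    """Euclidean distance between two label distributions (lower is better)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d, d_star)))


def squared(d: Sequence[float], d_star: Sequence[float]) -> float:
    """Sum of squared differences (lower is better)."""
    return sum((a - b) ** 2 for a, b in zip(d, d_star))


def kl_divergence(d: Sequence[float], d_star: Sequence[float], eps: float = 1e-12) -> float:
    """Kullback-Leibler divergence sum(d_i * ln(d_i / d_i*)) (lower is better).

    Terms with d_i == 0 contribute 0; eps (an assumption) keeps d_i* away from 0.
    """
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(d, d_star) if a > 0)


def intersection(d: Sequence[float], d_star: Sequence[float]) -> float:
    """Intersection similarity (higher is better, 1 for identical distributions)."""
    return sum(min(a, b) for a, b in zip(d, d_star))


def fidelity(d: Sequence[float], d_star: Sequence[float]) -> float:
    """Fidelity similarity sum(sqrt(d_i * d_i*)) (higher is better)."""
    return sum(math.sqrt(a * b) for a, b in zip(d, d_star))


# Made-up example distributions over three labels.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
for metric in (euclidean, squared, kl_divergence, intersection, fidelity):
    print(metric.__name__, metric(p, q))
```

For identical distributions the three distances are 0 and the two similarities are 1, which is why the arrows in Table 1 point down for the first three metrics and up for the last two.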
Fig.4  Example of hierarchical label data for movie reviews
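| Dataset | Samples | Features | Coarse-level labels | Fine-level labels |
| --- | --- | --- | --- | --- |
| BU_3DFE | 2500 | 243 | 3 | 6 |
| COPM | 7755 | 1869 | 3 | 5 |
Table 2  Details of the BU_3DFE and COPM datasets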
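| Experiment | Epoch range | Coarse-level weight $w_1$ | Fine-level weight $w_2$ | Learning rate |
| --- | --- | --- | --- | --- |
| 1 | 0<epoch≤150 | 0.90 | 0.10 | 0.0200 |
| 1 | 150<epoch≤350 | 0.50 | 0.50 | 0.0020 |
| 1 | 350<epoch≤500 | 0.10 | 0.90 | 0.0002 |
| 2 | 0<epoch≤150 | 0.60 | 0.40 | 0.0500 |
| 2 | 150<epoch≤350 | 0.50 | 0.50 | 0.0250 |
| 2 | 350<epoch≤500 | 0.20 | 0.80 | 0.0020 |
| 3 | 0<epoch≤150 | 0.65 | 0.35 | 0.0050 |
| 3 | 150<epoch≤350 | 0.50 | 0.50 | 0.0020 |
| 3 | 350<epoch≤500 | 0.35 | 0.65 | 0.0005 |
| 4 | 0<epoch≤150 | 0.75 | 0.25 | 0.0050 |
| 4 | 150<epoch≤350 | 0.50 | 0.50 | 0.0020 |
| 4 | 350<epoch≤500 | 0.25 | 0.75 | 0.0005 |
Table 3  Parameter settings for the four experiments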
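| Epoch range | Coarse-level weight $w_1$ | Fine-level weight $w_2$ | Learning rate |
| --- | --- | --- | --- |
| 0<epoch≤150 | 0.75 | 0.25 | 0.0050 |
| 150<epoch≤350 | 0.50 | 0.50 | 0.0025 |
| 350<epoch≤500 | 0.20 | 0.80 | 0.0005 |
Table 4  Weight settings during training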
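The staged settings in Table 4 can be read as a lookup from epoch to weights and learning rate, feeding a weighted sum of a coarse-level and a fine-level loss. The sketch below illustrates that reading. The schedule values come from Table 4, but the use of K-L divergence as the per-level loss and the names `stage_params` and `hierarchical_loss` are assumptions made for illustration, since this page does not give the loss formula.

```python
import math
from typing import Sequence, Tuple

# (last epoch of the stage, coarse weight w1, fine weight w2, learning rate), from Table 4.
SCHEDULE = [
    (150, 0.75, 0.25, 0.0050),
    (350, 0.50, 0.50, 0.0025),
    (500, 0.20, 0.80, 0.0005),
]


def stage_params(epoch: int) -> Tuple[float, float, float]:
    """Return (w1, w2, learning_rate) for a 1-based epoch in the range 1..500."""
    if epoch <= 0:
        raise ValueError("epoch must be positive")
    for last_epoch, w1, w2, lr in SCHEDULE:
        if epoch <= last_epoch:
            return w1, w2, lr
    raise ValueError("epoch is beyond the last scheduled stage")


def kl_divergence(d: Sequence[float], d_pred: Sequence[float], eps: float = 1e-12) -> float:
    """K-L divergence, used here as an assumed per-level loss."""
    return sum(a * math.log(a / max(b, eps)) for a, b in zip(d, d_pred) if a > 0)


def hierarchical_loss(epoch: int,
                      coarse_true: Sequence[float], coarse_pred: Sequence[float],
                      fine_true: Sequence[float], fine_pred: Sequence[float]) -> float:
    """Weighted sum w1 * L_coarse + w2 * L_fine, with weights chosen by epoch."""
    w1, w2, _ = stage_params(epoch)
    return w1 * kl_divergence(coarse_true, coarse_pred) + w2 * kl_divergence(fine_true, fine_pred)


# Made-up distributions: the same prediction error is weighted differently by stage.
coarse_t, coarse_p = [0.6, 0.3, 0.1], [0.5, 0.3, 0.2]
fine_t, fine_p = [0.4, 0.2, 0.3, 0.1], [0.4, 0.2, 0.3, 0.1]
print(hierarchical_loss(10, coarse_t, coarse_p, fine_t, fine_p))
print(hierarchical_loss(400, coarse_t, coarse_p, fine_t, fine_p))
```

The schedule shifts emphasis from the coarse level early in training to the fine level later on, while the learning rate decays by stage. In the example, the coarse-level error counts for less at epoch 400 than at epoch 10.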
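| Level | Euclidean↓ | Squared↓ | K-L↓ | Intersec↑ | Fidelity↑ |
| --- | --- | --- | --- | --- | --- |
| Coarse level | 0.1213 | 0.0235 | 0.0506 | 0.9070 | 0.9897 |
| Fine level | 0.1297 | 0.0226 | 0.0558 | 0.8775 | 0.9861 |
Table 5  Experimental results on the BU_3DFE dataset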
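| Level | Euclidean↓ | Squared↓ | K-L↓ | Intersec↑ | Fidelity↑ |
| --- | --- | --- | --- | --- | --- |
| Coarse level | 0.2061 | 0.0625 | 0.0961 | 0.8381 | 0.9758 |
| Fine level | 0.1695 | 0.0398 | 0.1092 | 0.8334 | 0.9727 |
Table 6  Experimental results on the COPM dataset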
| Algorithm | Euclidean↓ | Squared↓ | K-L↓ | Intersec↑ | Fidelity↑ |
| --- | --- | --- | --- | --- | --- |
| AA-BP | 0.1696 | 0.0333 | 0.0875 | 0.8351 | 0.9791 |
| AA-KNN | 0.1779 | 0.0400 | 0.0994 | 0.8335 | 0.9755 |
| CPNN | 0.1711 | 0.0349 | 0.0875 | 0.8339 | 0.9794 |
| IIS-LLD | 0.1704 | 0.0346 | 0.0868 | 0.8339 | 0.9794 |
| PT-SVM | 0.1806 | 0.0370 | 0.0935 | 0.8220 | 0.9777 |
| H-LDL | 0.1297 | 0.0226 | 0.0558 | 0.8775 | 0.9861 |
Table 7  Baseline comparison on the BU_3DFE dataset
| Algorithm | Euclidean↓ | Squared↓ | K-L↓ | Intersec↑ | Fidelity↑ |
| --- | --- | --- | --- | --- | --- |
| AA-BP | 0.1743 | 0.0385 | 0.1071 | 0.8289 | 0.9725 |
| AA-KNN | 0.1800 | 0.0422 | 0.1151 | 0.8227 | 0.9703 |
| CPNN | 0.2038 | 0.0536 | 0.1471 | 0.8022 | 0.9629 |
| IIS-LLD | 0.2076 | 0.0526 | 0.1322 | 0.8025 | 0.9662 |
| PT-SVM | 0.3750 | 0.1463 | 0.3809 | 0.6309 | 0.8998 |
| H-LDL | 0.1695 | 0.0398 | 0.1092 | 0.8334 | 0.9727 |
Table 8  Baseline comparison on the COPM dataset
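The gains quoted in the abstract can be checked against Tables 7 and 8. The sketch below takes, for each metric, the best baseline value (lowest for the three distances, highest for the two similarities) and reports its absolute gap to H-LDL, multiplied by 100. Reading the abstract's percentages as absolute differences of this kind is an interpretation, and the function name `gaps_vs_best_baseline` is hypothetical. The numbers are copied from the two tables.

```python
from typing import Dict, Tuple

METRICS = ("Euclidean", "Squared", "K-L", "Intersec", "Fidelity")
LOWER_IS_BETTER = {"Euclidean", "Squared", "K-L"}

Row = Tuple[float, float, float, float, float]

# Values copied from Table 7 (BU_3DFE) and Table 8 (COPM).
BU_3DFE: Dict[str, Row] = {
    "AA-BP": (0.1696, 0.0333, 0.0875, 0.8351, 0.9791),
    "AA-KNN": (0.1779, 0.0400, 0.0994, 0.8335, 0.9755),
    "CPNN": (0.1711, 0.0349, 0.0875, 0.8339, 0.9794),
    "IIS-LLD": (0.1704, 0.0346, 0.0868, 0.8339, 0.9794),
    "PT-SVM": (0.1806, 0.0370, 0.0935, 0.8220, 0.9777),
    "H-LDL": (0.1297, 0.0226, 0.0558, 0.8775, 0.9861),
}
COPM: Dict[str, Row] = {
    "AA-BP": (0.1743, 0.0385, 0.1071, 0.8289, 0.9725),
    "AA-KNN": (0.1800, 0.0422, 0.1151, 0.8227, 0.9703),
    "CPNN": (0.2038, 0.0536, 0.1471, 0.8022, 0.9629),
    "IIS-LLD": (0.2076, 0.0526, 0.1322, 0.8025, 0.9662),
    "PT-SVM": (0.3750, 0.1463, 0.3809, 0.6309, 0.8998),
    "H-LDL": (0.1695, 0.0398, 0.1092, 0.8334, 0.9727),
}


def gaps_vs_best_baseline(table: Dict[str, Row], ours: str = "H-LDL") -> Dict[str, float]:
    """Gap between `ours` and the best baseline per metric, times 100.

    Positive means `ours` is better: lower for distances, higher for similarities.
    """
    gaps = {}
    for j, metric in enumerate(METRICS):
        baseline = [row[j] for name, row in table.items() if name != ours]
        if metric in LOWER_IS_BETTER:
            gap = min(baseline) - table[ours][j]
        else:
            gap = table[ours][j] - max(baseline)
        gaps[metric] = round(gap * 100, 2)
    return gaps


bu_gaps = gaps_vs_best_baseline(BU_3DFE)
copm_gaps = gaps_vs_best_baseline(COPM)
print("BU_3DFE:", bu_gaps)
print("COPM:", copm_gaps)
```

Under this reading, the BU_3DFE gaps come out to 3.99, 1.07, 3.10, 4.24, and 0.67, and the COPM gaps for Euclidean, Intersec, and Fidelity to 0.48, 0.45, and 0.02, which matches the abstract. The COPM gaps for Squared and K-L are slightly negative, in line with the abstract's statement that those two metrics showed no clear decrease.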
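Fig.5  Comparison of the effect of the hierarchical structure conversion layer on the BU_3DFE dataset
Fig.6  Comparison of the effect of the hierarchical structure conversion layer on the COPM dataset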