Please wait a minute...
Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (12): 41-48    DOI: 10.11925/infotech.2096-3467.2017.0625
Orginal Article Current Issue | Archive | Adv Search |
Ranking Learning Method Based on Random Walk Model
He Wanying, Yang Jianlin()
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
Download: PDF (947 KB)   HTML ( 1
Export: BibTeX | EndNote (RIS)      
Abstract  

[Objective] This paper tries to obtain the tagging data of training corpus for supervised ranking learning tasks. [Methods] First, we proposed a ranking learning method based on the random walk model. Then, we used this method to automatically tag the training data, which also reduced the dependency of ranking on the tags. Finally, we examined our method with the OHSUMED data set. [Results] We finished the ranking learning tasks with only half of samples tagged. Compared with algorithms based on all tagged samples, performance of the proposed method was better than the RankNet algorithm but not as good as the ListNet one. [Limitations] Our method requires a random walk for each query, which is time consuming in practice. [Conclusions] The proposed method can effectively rank the learning results of training data.

Key wordsRanking Learning      Random Walk Model      Semi-supervised Learning      ListNet     
Received: 29 June 2017      Published: 29 December 2017
ZTFLH:  G350  

Cite this article:

He Wanying,Yang Jianlin. Ranking Learning Method Based on Random Walk Model. Data Analysis and Knowledge Discovery, 2017, 1(12): 41-48.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.2096-3467.2017.0625     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2017/V1/I12/41

相关性得分 查询编号 特征值1 ... 特征值44 特征值45 文档编号
0 qid:1 0.2000 ... 0.0535 0.1157 docid = 82929
1 qid:1 0.0000 ... 0.0469 0.0389 docid = 264238
2 qid:1 0.4000 ... 0.0001 0.0000 docid = 156875
0 qid:2 0.7500 ... 1.0000 1.0000 docid = 40092
... ... ... ... ... ... ...
数据集编号 p=10% p=20% p=30% p=40% p=50% p=60%
S1 0.7034±0.080 0.7490±0.0348 0.7985±0.0363 0.8638±0.0361 0.9080±0.0420 0.9408±0.0454
S2 0.6850±0.0611 0.7343±0.0353 0.7928±0.0320 0.8772±0.0355 0.9190±0.0405 0.9327±0.0479
S3 0.6436±0.0407 0.7601±0.0278 0.8272±0.0270 0.8724±0.0291 0.9026±0.0319 0.9149±0.0381
S4 0.6959±0.0425 0.7398±0.0270 0.8168±0.0264 0.8987±0.0298 0.9213±0.0358 0.9232±0.0420
S5 0.7035±0.0413 0.7469±0.0215 0.8189±0.0206 0.8503±0.0248 0.9204±0.0299 0.9216±0.0355
文件夹 训练集 验证集 测试集
Folder1 {S1, S2, S3} S4 S5
Folder2 {S2, S3, S4} S5 S1
Folder3 {S3, S4, S5} S1 S2
Folder4 {S4, S5, S1} S2 S3
Folder5 {S5, S1, S2} S3 S4
排序学习方法 Folder1 Folder2 Folder3 Folder4 Folder5 平均
RankNet 0.3159 0.4518 0.4316 0.4924 0.4597 0.4303
RWR+ListNet 0.2937 0.4364 0.4298 0.47 0.4563 0.4172
ListNet 0.3588 0.4536 0.4459 0.5065 0.4686 0.4467
[1] Tian W, Zhu F.Learning to Rank Using Semantic Features in Document Retrieval[C]//Proceedings of Wri Global Congress on Intelligent Systems. IEEE Computer Society, 2009: 500-504.
[2] Farzi S, Faili H. Improving Statistical Machine Translation Using Syntax-based Learning-to-Rank System[J]. International Journal of Electrical Power & Energy Systems, 2015, 54(11): fqv032.
doi: 10.1093/llc/fqv032
[3] Grady L.Random Walks for Image Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(11): 1768-1783.
doi: 10.1109/TPAMI.2006.233 pmid: 17063682
[4] Zhang X, He B, Luo T.Training Query Filtering for Semi-supervised Learning to Rank with Pseudo Labels[J]. World Wide Web-Internet & Web Information Systems, 2016, 19(5): 833-864.
doi: 10.1007/s11280-015-0363-z
[5] Duh K, Kirchhoff K.Learning to Rank with Partially-labeled Data[C]//Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2008: 251-258.
[6] Duh K, Kirchhoff K.Semi-supervised Ranking for Document Retrieval[J]. Computer Speech & Language, 2011, 25(2): 261-281.
doi: 10.1016/j.csl.2010.05.002
[7] Kim K H, Choi S.Incremental Learning to Rank with Partially-labeled Data[C]//Proceedings of the Workshop on Web Search Click Data. ACM, 2009: 20-27.
[8] Abdel Hady M F, Schwenker F. Combining Committee-Based Semi-Supervised Learning and Active Learning[J]. Journal of Computer Science & Technology(计算机科学技术学报(英文版)), 2010, 25(4): 681-698.
doi: 10.1007/s11390-010-1053-z
[9] Usunier N, Amini M-R, Goutte C.Multiview Semi-supervised Learning for Ranking Multilingual Documents[C]// Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, 2011: 443-458.
[10] 何海江, 龙跃进. 适应文档检索的半监督多样本排序学习算法[J]. 计算机应用, 2011, 31(11): 3108-3111.
doi: 10.3724/SP.J.1087.2011.03108
[10] (He Haijiang, Long Yuejin.Semi-supervised Learning Listwise Ranking Functions for Document Retrival[J]. Journal of Computer Applications, 2011, 31(11): 3108-3111.)
doi: 10.3724/SP.J.1087.2011.03108
[11] 高炜, 梁立, 徐天伟, 等. 半监督k-部排序算法及在本体中的应用[J]. 中北大学学报: 自然科学版, 2013, 34(2): 140-146.
[11] (Gao Wei, Liang Li, Xu Tianwei, et al.Semi-supervised k-Partite Ranking Algorithm and Application in Ontology[J]. Journal of North University of China: Natural Science Edition, 2013, 34(2): 140-146.)
[12] 缪志高. 半监督排序学习算法研究[D]. 合肥: 中国科学技术大学, 2014.
[12] (Miao Zhigao.Research on Semi-supervised Ranking Algorithms[D]. Hefei: University of Science and Technology of China, 2014.)
[13] 张新. 半监督排序学习研究[D]. 北京: 中国科学院大学, 2015.
[13] (Zhang Xin.Research on Semi-supervised Ranking Learning[D]. Beijing: University of Chinese Academy of Sciences, 2015.)
[14] 奚凌然, 王小平. 一种结合LPA 半监督学习的排序学习算法[J]. 计算机应用与软件, 2016, 33(1): 286-290.
[14] (Xi Lingran, Wang Xiaoping.A Ranking Learning Algorithm Combing Semi-supervised Learning of LPA[J]. Computer Applications and Software, 2016, 33(1): 286-290.)
[15] Cao Y, Xu J, Liu T Y, et al.Adapting Ranking SVM to Document Retrieval[C]//Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2006: 186-193.
[16] Cao Z, Qin T, Liu T Y, et al.Learning to Rank: From Pairwise Approach to Listwise Approach[C]// Proceedings of the 24th International Conference on Machine Learning. ACM, 2007: 129-136.
[17] Grady L.Random Walks for Image Segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(11): 1768-1783.
doi: 10.1109/TPAMI.2006.233 pmid: 17063682
[18] Tong H, Faloutsos C, Pan J Y.Fast Random Walk with Restart and Its Applications[C]//Proceedings of the International Conference on Data Mining. IEEE Computer Society, 2006: 613-622.
[19] Wang H, Huang H, Ding C.Image Annotation Using Bi-relational Graph of Images and Semantic Labels[C]// Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2011: 793-800.
[20] Szummer M, Jaakkola T.Partially Labeled Classification with Markov Random Walks[C]// Proceedings of the International Conference on Neural Information Processing Systems: Natural and Synthetic. MIT Press, 2001:945-952.
[21] 郑伟, 王朝坤, 刘璋, 等. 一种基于随机游走模型的多标签分类算法[C]. 见: NDBC 2010中国数据库学术会议集. 2010: 1418-1426.
[21] (Zheng Wei, Wang Chaokun, Liu Zhang, et al.A Multi-label Classfication Algorithm Based on Random Walk Model[C]// Proceedings of the NDBC 2010 National Database Conference. 2010:1418-1426.)
[22] Belkin M, Niyogi P, Sindhwani V.On Manifold Regularization[C]// Proceedings of the AISTATS. 2005.
[23] Qin T, Liu T Y, Xu J, et al.LETOR: A Benchmark Collection for Research on Learning to Rank for Information Retrieval[J]. Information Retrieval, 2010, 13(4): 346-374.
doi: 10.1007/s10791-009-9123-y
[24] Burges C, Shaked T, Renshaw E, et al.Learning To Rank Using Gradient Descent[C]//Proceedings of International Conference on Machine Learning. ACM, 2005:89-96.
[1] Zhang Qian, Liu Huailiang. An Algorithm of Short Text Classification Based on Semi-supervised Learning[J]. 现代图书情报技术, 2013, 29(2): 30-35.
  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn