Data Analysis and Knowledge Discovery  2017, Vol. 1 Issue (12): 41-48    DOI: 10.11925/infotech.2096-3467.2017.0625
Ranking Learning Method Based on Random Walk Model
He Wanying, Yang Jianlin
School of Information Management, Nanjing University, Nanjing 210023, China
Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing 210023, China
[Objective] This paper aims to obtain labeled training data for supervised ranking learning tasks. [Methods] First, we propose a ranking learning method based on the random walk model. Then, we use this method to label the training data automatically, which also reduces the dependence of ranking on the labels. Finally, we evaluate the method on the OHSUMED data set. [Results] The method completed the ranking learning task with only half of the samples labeled. Compared with algorithms trained on fully labeled samples, the proposed method performed better than RankNet but not as well as ListNet. [Limitations] The method requires a separate random walk for each query, which is time consuming in practice. [Conclusions] The proposed method is effective for ranking learning when the training data are only partially labeled.

Key words: Ranking Learning; Random Walk Model; Semi-supervised Learning; ListNet
Received: 29 June 2017      Published: 29 December 2017
ZTFLH:  G350  

Cite this article:

He Wanying, Yang Jianlin. Ranking Learning Method Based on Random Walk Model. Data Analysis and Knowledge Discovery, 2017, 1(12): 41-48.

Relevance score  Query ID  Feature 1  ...  Feature 44  Feature 45  Document ID
0                qid:1     0.2000     ...  0.0535      0.1157      docid = 82929
1                qid:1     0.0000     ...  0.0469      0.0389      docid = 264238
2                qid:1     0.4000     ...  0.0001      0.0000      docid = 156875
0                qid:2     0.7500     ...  1.0000      1.0000      docid = 40092
...              ...       ...        ...  ...         ...         ...

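The row layout above (relevance score, query ID, feature values, document ID) can be read with a few lines of code. The following is a minimal sketch, not the authors' loader: the names `LetorRow` and `parse_row` are hypothetical, and the handling of `index:value` tokens and of a `#` before `docid` is an assumption about how the raw files may look rather than something shown in the table.

```python
from typing import NamedTuple


class LetorRow(NamedTuple):
    """One labeled query-document pair (hypothetical container)."""
    relevance: int
    qid: str
    features: list
    docid: str


def parse_row(line: str) -> LetorRow:
    """Parse a row laid out as: score, qid:<id>, feature values, docid = <id>."""
    # Split the trailing "docid = <id>" part from the rest of the row.
    body, _, doc_part = line.partition("docid")
    docid = doc_part.replace("=", " ").split()[0]
    # Assumption: raw files may put '#' before 'docid'; drop it if present.
    tokens = body.replace("#", " ").split()
    relevance = int(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    # Accept plain values ("0.2000") and, as an assumption, "index:value" tokens.
    # Elided columns ("...") are skipped, not filled in.
    features = [float(t.split(":")[-1]) for t in tokens[2:] if t != "..."]
    return LetorRow(relevance, qid, features, docid)


row = parse_row("1 qid:1 0.0000 ... 0.0469 0.0389 docid = 264238")
print(row.relevance, row.qid, row.features, row.docid)
```

Grouping the parsed rows by `qid` then gives the per-query document lists that a ranking learner is trained on.
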
Data set  p=10%          p=20%          p=30%          p=40%          p=50%          p=60%
S1        0.7034±0.080   0.7490±0.0348  0.7985±0.0363  0.8638±0.0361  0.9080±0.0420  0.9408±0.0454
S2        0.6850±0.0611  0.7343±0.0353  0.7928±0.0320  0.8772±0.0355  0.9190±0.0405  0.9327±0.0479
S3        0.6436±0.0407  0.7601±0.0278  0.8272±0.0270  0.8724±0.0291  0.9026±0.0319  0.9149±0.0381
S4        0.6959±0.0425  0.7398±0.0270  0.8168±0.0264  0.8987±0.0298  0.9213±0.0358  0.9232±0.0420
S5        0.7035±0.0413  0.7469±0.0215  0.8189±0.0206  0.8503±0.0248  0.9204±0.0299  0.9216±0.0355

Folder   Training set  Validation set  Test set
Folder1  {S1, S2, S3}  S4              S5
Folder2  {S2, S3, S4}  S5              S1
Folder3  {S3, S4, S5}  S1              S2
Folder4  {S4, S5, S1}  S2              S3
Folder5  {S5, S1, S2}  S3              S4

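The folder assignment above follows a simple rotation: each folder shifts the five subsets by one position, using three for training, the next for validation, and the last for testing. A short sketch of that rotation is given below; the function name `rotating_folds` is hypothetical and only reproduces the pattern visible in the table.

```python
def rotating_folds(subsets):
    """Return (train, validation, test) splits by rotating the subset list."""
    n = len(subsets)
    folds = []
    for i in range(n):
        # The first n-2 subsets after the rotation offset form the training set.
        train = [subsets[(i + k) % n] for k in range(n - 2)]
        # The next subset is used for validation, the last one for testing.
        valid = subsets[(i + n - 2) % n]
        test = subsets[(i + n - 1) % n]
        folds.append((train, valid, test))
    return folds


folds = rotating_folds(["S1", "S2", "S3", "S4", "S5"])
for number, (train, valid, test) in enumerate(folds, start=1):
    print(f"Folder{number}", train, valid, test)
```

With the five subsets S1 to S5, the splits produced match the rows of the table, so every subset serves as the test set exactly once.
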
Ranking learning method  Folder1  Folder2  Folder3  Folder4  Folder5  Average
RankNet                  0.3159   0.4518   0.4316   0.4924   0.4597   0.4303
RWR+ListNet              0.2937   0.4364   0.4298   0.47     0.4563   0.4172
ListNet                  0.3588   0.4536   0.4459   0.5065   0.4686   0.4467
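
The "RWR" in the RWR+ListNet row presumably refers to a random walk with restart, in line with the random walk model named in the abstract. This page does not give the graph construction, the similarity measure, the restart probability, or the rule that turns walk scores into labels, so the sketch below only illustrates the generic technique under stated assumptions: a hypothetical document-similarity matrix `weights`, a restart distribution `seed` concentrated on the labeled documents of one query, and an arbitrary restart probability of 0.15. It should not be read as the authors' implementation.

```python
def rwr_scores(weights, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Generic random walk with restart by power iteration (illustrative only)."""
    n = len(weights)
    # Column-normalise so the transition probabilities out of each node sum to 1.
    col_sums = [sum(weights[i][j] for i in range(n)) or 1.0 for j in range(n)]
    trans = [[weights[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    # Restart distribution: probability mass placed on the labeled (seed) nodes.
    total = sum(seed)
    e = [s / total for s in seed]
    r = e[:]
    for _ in range(max_iter):
        r_next = [
            (1 - restart) * sum(trans[i][j] * r[j] for j in range(n)) + restart * e[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(r_next, r)) < tol:
            return r_next
        r = r_next
    return r


# Hypothetical chain of four documents (0-1-2-3); only document 0 is labeled.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = rwr_scores(chain, seed=[1, 0, 0, 0])
print(scores)
```

In this toy graph the score of an unlabeled document falls as its distance from the labeled one grows, which is the property a labeling rule could exploit. Because one walk of this kind is needed per query, the cost grows with the number of queries, consistent with the limitation stated in the abstract.
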
