New Technology of Library and Information Service  2016, Vol. 32 Issue (3): 25-32    DOI: 10.11925/infotech.1003-3513.2016.03.04
Generating Hierarchical Paths of Chinese Text from Wikipedia
Xia Tian
Key Laboratory of Data Engineering and Knowledge Engineering of Ministry of Education,Renmin University of China, Beijing 100872, China;School of Information Resource Management, Renmin University of China, Beijing 100872, China
[Objective] To generate hierarchical semantic paths for texts from Wikipedia. [Methods] We first build an article concept vector for a Chinese text from Wikipedia through explicit semantic analysis. We then map the vector to the category nodes of a hierarchical, tree-like graph. Finally, we generate the hierarchical paths through seed-node information diffusion and top-down path selection, combined with optimization techniques. [Results] On the test dataset, the top 20 paths were sorted in descending order of relevance, and the average relevance of the first generated hierarchical path was 54.10%. [Limitations] We did not analyze how varying the number of explicit concepts in the concept vector affects the quality of the generated paths. [Conclusions] The hierarchical paths generated from Wikipedia can reflect the main semantic meaning of the given texts.
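
The three steps named in the Methods can be illustrated with a minimal Python sketch. It is not the author's implementation: the word-to-concept weights, the concept-to-category mapping, the single-parent category tree, the `decay` factor, and the greedy selection rule are all hypothetical stand-ins, and the optimization step mentioned in the abstract is omitted. The real category structure is a tree-like graph rather than a strict tree.

```python
from collections import defaultdict

# Hypothetical toy data; a real system would derive these from a Wikipedia dump.
WORD_TO_CONCEPTS = {
    "neural": {"Neural network": 0.9, "Neuron": 0.4},
    "network": {"Neural network": 0.7, "Computer network": 0.8},
    "training": {"Neural network": 0.5},
}
CONCEPT_TO_CATEGORIES = {
    "Neural network": ["Machine learning"],
    "Neuron": ["Neuroscience"],
    "Computer network": ["Networking"],
}
# Simplifying assumption: each category has exactly one parent.
CATEGORY_PARENT = {
    "Machine learning": "Computer science",
    "Networking": "Computer science",
    "Computer science": "Root",
    "Neuroscience": "Biology",
    "Biology": "Root",
}


def concept_vector(tokens):
    """ESA-style step: sum each token's concept weights into one vector."""
    vec = defaultdict(float)
    for token in tokens:
        for concept, weight in WORD_TO_CONCEPTS.get(token, {}).items():
            vec[concept] += weight
    return dict(vec)


def seed_scores(vec):
    """Map concept weights onto the category nodes that contain them."""
    seeds = defaultdict(float)
    for concept, weight in vec.items():
        for category in CONCEPT_TO_CATEGORIES.get(concept, []):
            seeds[category] += weight
    return dict(seeds)


def diffuse(seeds, decay=0.5):
    """Spread each seed score to its ancestors, damped by `decay` per level."""
    scores = defaultdict(float)
    for category, score in seeds.items():
        node, weight = category, score
        while node is not None:
            scores[node] += weight
            node = CATEGORY_PARENT.get(node)  # None once past the root
            weight *= decay
    return dict(scores)


def best_path(scores, root="Root"):
    """Top-down selection: follow the highest-scoring child at each level."""
    children = defaultdict(list)
    for child, parent in CATEGORY_PARENT.items():
        children[parent].append(child)
    path, node = [root], root
    while True:
        candidates = [c for c in children[node] if scores.get(c, 0.0) > 0.0]
        if not candidates:
            return path
        node = max(candidates, key=lambda c: scores[c])
        path.append(node)


tokens = ["neural", "network", "training"]
scores = diffuse(seed_scores(concept_vector(tokens)))
print(" > ".join(best_path(scores)))  # Root > Computer science > Machine learning
```

In this toy run the "Machine learning" seed dominates, so the greedy descent picks "Computer science" over "Biology" at the first level. Ranking several candidate paths, as the Results describe, would require keeping more than one child per level.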

Key words: Semantic path; Explicit semantic analysis; Hierarchical classification; Wikipedia
Received: 16 November 2015      Published: 12 April 2016

Cite this article:

Xia Tian. Generating Hierarchical Paths of Chinese Text from Wikipedia. New Technology of Library and Information Service, 2016, 32(3): 25-32.

