New Technology of Library and Information Service  2011, Vol. 27 Issue (6): 20-26    DOI: 10.11925/infotech.1003-3513.2011.06.04
Study on Talents Description Web Page Automatic Recognition System
Xu Jian1, Wen Haosheng2
1. School of Information Management, Sun Yat-Sen University, Guangzhou 510006, China;
2. Shenzhen Thunder Network Technology Company Ltd., Shenzhen 518057, China
Abstract  This paper presents an automatic recognition system for talents description Web pages and implements methods for recognizing university talents description Web pages crawled by the Nutch crawler. The recognition process uses features drawn from the Web page URL, the title tag content, the anchor text, and the page content. The value of each feature is computed by matching against a name list, a positive feature word list, and a negative feature word list. Using these feature values, the system applies LibSVM to classify pages automatically as talents description Web pages or not.
Key words  LibSVM; Talents description Web page; Automatic classification; Classification feature extraction
Received: 09 May 2011      Published: 15 August 2011
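
The feature computation summarized in the abstract can be illustrated with a minimal sketch. It assumes that each feature is a simple count of list matches in one of the four page fields (URL, title, anchor text, content), which the abstract does not confirm. The word lists, the Page class, and the function names below are hypothetical and are not taken from the paper. The output is a line in the sparse text format that LibSVM reads (label followed by index:value pairs, with indices starting at 1).

```python
from dataclasses import dataclass

# Hypothetical lists for illustration only; the paper's actual lists are not given here.
NAMES = ["xu jian", "wen haosheng"]
POSITIVE = ["professor", "biography", "faculty", "cv"]
NEGATIVE = ["news", "download", "login"]


@dataclass
class Page:
    url: str
    title: str
    anchor_text: str
    content: str


def count_matches(text, terms):
    """Count case-insensitive substring occurrences of every term in text."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in terms)


def extract_features(page):
    """Return 12 values: 4 page fields x 3 lists (names, positive, negative)."""
    features = []
    for field in (page.url, page.title, page.anchor_text, page.content):
        for terms in (NAMES, POSITIVE, NEGATIVE):
            features.append(float(count_matches(field, terms)))
    return features


def to_libsvm_line(label, features):
    """Format one sample as 'label index:value ...', omitting zero values."""
    pairs = [f"{i}:{v:g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([str(label)] + pairs)


page = Page(
    url="http://example.edu/faculty/xu_jian.html",
    title="Xu Jian - Professor",
    anchor_text="Xu Jian",
    content="Xu Jian is a professor. Biography and CV.",
)
line = to_libsvm_line(1, extract_features(page))
print(line)
```

Lines produced this way for labeled positive and negative pages could be written to a file and passed to LibSVM for training. Plain substring counting is a simplification: a short term such as "cv" can match inside unrelated words, so a real system would likely need tokenization or weighting.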



Xu Jian, Wen Haosheng. Study on Talents Description Web Page Automatic Recognition System. New Technology of Library and Information Service, 2011, 27(6): 20-26.

