To address the problem of poor information coverage in Web data mining, this paper proposes a configurable crawling method for the deep Web that can significantly improve the results of a general search engine. The method classifies Web pages and processes key information from page content in order to formulate sensible queries. Experimental results support the effectiveness of the method.
王舜燕,李蕾,吴兵华. 基于ID3分类算法的深度网络爬虫设计[J]. 现代图书情报技术, 2008, 24(6): 41-45.
Wang Shunyan, Li Lei, Wu Binghua. Design of Web Crawler for Deep Web Based on ID3 Algorithm[J]. New Technology of Library and Information Service, 2008, 24(6): 41-45.