New Technology of Library and Information Service  2004, Vol. 20 Issue (10): 51-54    DOI: 10.11925/infotech.1003-3513.2004.10.10
Identifying the Topic of Web Information in Web Information Gathering
Shao Xiaoliang   Liu Hong
(The Network Center of Second Military Medical University, Shanghai 200433, China)

This paper introduces a core component of the Web topic information gathering system that we designed: identifying the topic of Web information. The algorithm starts by building a professional topic dictionary and takes the characteristics of Web page text into account. It substantially improves the efficiency and accuracy of the system, and it can be applied to other topic fields.
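The abstract does not give the algorithm's details, so the following is only a minimal sketch of one way dictionary-based topic identification could work, not the authors' method. It assumes a hypothetical weighted term dictionary (`TOPIC_DICTIONARY`), hypothetical weights for text in the page title and headings (`TAG_WEIGHTS`), a hypothetical threshold, and simple English tokenization; Chinese pages would first need word segmentation.

```python
import re
from html.parser import HTMLParser

# Hypothetical topic dictionary (term -> weight); the paper's actual
# dictionary and weights are not given in the abstract.
TOPIC_DICTIONARY = {"medicine": 2.0, "clinical": 1.5, "hospital": 1.0}

# Assumed weights reflecting where a term appears on the page.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "h2": 1.5}
DEFAULT_WEIGHT = 1.0
SKIPPED_TAGS = {"script", "style"}


class WeightedTextParser(HTMLParser):
    """Collects (text, weight) pairs, weighting text by its enclosing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.chunks = []  # (text, weight) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the innermost matching open tag, if there is one.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if any(tag in SKIPPED_TAGS for tag in self.stack):
            return
        weight = max(
            (TAG_WEIGHTS.get(tag, DEFAULT_WEIGHT) for tag in self.stack),
            default=DEFAULT_WEIGHT,
        )
        self.chunks.append((data, weight))


def topic_score(html, dictionary=TOPIC_DICTIONARY):
    """Return the position-weighted average dictionary weight per word."""
    parser = WeightedTextParser()
    parser.feed(html)
    parser.close()
    score = 0.0
    total = 0.0
    for text, weight in parser.chunks:
        for word in re.findall(r"[a-z]+", text.lower()):
            total += weight
            score += weight * dictionary.get(word, 0.0)
    return score / total if total else 0.0


def is_on_topic(html, threshold=0.2):
    """Assumed decision rule: keep pages whose score reaches the threshold."""
    return topic_score(html) >= threshold
```

In a gathering system, a check like `is_on_topic(page_html)` could decide whether a fetched page is kept; the dictionary, weights, and threshold would need tuning for each topic field.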

Key words: Web; Topic Information; Topic Identification; Information Gathering
Received: 16 April 2004      Published: 25 October 2004


Corresponding Author: Shao Xiaoliang
About authors: Shao Xiaoliang, Liu Hong

Cite this article:

Shao Xiaoliang, Liu Hong. Identifying the Topic of Web Information in Web Information Gathering. New Technology of Library and Information Service, 2004, 20(10): 51-54.


