New Technology of Library and Information Service  2005, Vol. 21 Issue (5): 41-45    DOI: 10.11925/infotech.1003-3513.2005.05.10
The Algorithm of Forecasting URL-Topic Based on Web Structure and Web Page Contents
Liu Hong, Shao Xiaoliang, Hu Jibing
(The Network Information Center of Second Military Medical University, Shanghai 200433, China)
Abstract  

This paper introduces the core algorithm of a Web topic information gathering system designed by the authors: the Forecast URL-Topic algorithm. Drawing on related theory and an analysis of experimental data, it finds that the topic of a hyperlink is determined primarily by three factors: the topic similarity of the parent Web page, the topic similarity of the extended anchor text, and the structural characteristics of the Web graph. On this basis, it proposes an algorithm for forecasting URL topics based on Web structure and Web page contents. System evaluation results show that the algorithm is highly efficient.
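
The abstract names three factors but does not give the formula that combines them. The following is only a minimal illustrative sketch, assuming a bag-of-words cosine similarity for the two text factors, a share of in-links from already-relevant pages as a stand-in for the Web graph factor, and a weighted linear combination. The function names, the structural measure, and the weights are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words term-frequency vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def predict_url_topic_score(topic: str,
                            parent_text: str,
                            anchor_text: str,
                            relevant_inlinks: int,
                            total_inlinks: int,
                            weights=(0.4, 0.4, 0.2)) -> float:
    """Estimate how likely an unvisited URL is to be on-topic.

    Assumption: the three factors are combined linearly. The weights
    are placeholders, not values reported by the authors.
    """
    # Factor 1: topic similarity of the parent Web page.
    parent_sim = cosine_similarity(topic, parent_text)
    # Factor 2: topic similarity of the (extended) anchor text.
    anchor_sim = cosine_similarity(topic, anchor_text)
    # Factor 3: hypothetical structural signal, the share of known
    # in-links that come from pages already judged relevant.
    structure = relevant_inlinks / total_inlinks if total_inlinks else 0.0
    w_parent, w_anchor, w_structure = weights
    return (w_parent * parent_sim
            + w_anchor * anchor_sim
            + w_structure * structure)
```

In a crawler, a score like this would typically be used to order the queue of unvisited URLs so that links predicted to be on-topic are fetched first; whether the authors' system does this in the same way is not stated in the abstract.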

Key words: Web structure; Hyperlink; Topic; Forecast; Algorithm
Received: 31 December 2004      Published: 25 May 2005
ZTFLH: TP391
Corresponding Authors: Liu Hong     E-mail: llhhyybb@163.com
About authors: Liu Hong, Shao Xiaoliang, Hu Jibing

Cite this article:

Liu Hong, Shao Xiaoliang, Hu Jibing. The Algorithm of Forecasting URL-Topic Based on Web Structure and Web Page Contents. New Technology of Library and Information Service, 2005, 21(5): 41-45.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2005.05.10     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2005/V21/I5/41

  Copyright © 2016 Data Analysis and Knowledge Discovery   Tel/Fax:(010)82626611-6626,82624938   E-mail:jishu@mail.las.ac.cn