New Technology of Library and Information Service  2005, Vol. 21 Issue (9): 76-79    DOI: 10.11925/infotech.1003-3513.2005.09.17
The Technology of Web Information Extraction and Its Application in the TBT Early-Warning System
Zhai Dongsheng   Yu Yang   Li Li
(The Economics and Management School, Beijing University of Technology, Beijing 100022, China)
Abstract  

This paper studies a technique for extracting information of interest, in real time, from data-oriented Web pages. The technique automatically identifies table structures and separates the different kinds of data they contain. To analyze and classify the data, it combines keyword-based sorting with division by table structure; the approach builds on the idea of ontology and integrates a series of mature models, such as the support vector machine (SVM) and the hidden Markov model (HMM). The technique has been tested and applied in the dynamic information-gathering component of a TBT early-warning system, where it performs well.
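
The two steps named in the abstract, identifying table structure and sorting data by keywords, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `TableExtractor`, the `CONCEPT_KEYWORDS` dictionary (a stand-in for a domain ontology), the function names, and the sample HTML are all hypothetical, and the SVM and HMM models mentioned above are not reproduced. The sketch assumes a simple, non-nested table whose first row holds the column headers.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every table row in an HTML page."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical mini "ontology": concept -> header keywords that signal it.
CONCEPT_KEYWORDS = {
    "country": ("country", "member"),
    "product": ("product", "goods"),
    "date": ("date", "notified"),
}


def classify_header(header):
    """Map a column header to a concept by keyword match, else 'unknown'."""
    lowered = header.lower()
    for concept, keywords in CONCEPT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return concept
    return "unknown"


def extract_records(html):
    """Return one dict per data row, keyed by the concept of each column."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    concepts = [classify_header(h) for h in header]
    return [dict(zip(concepts, row)) for row in body]


# Made-up sample page, for illustration only.
SAMPLE = """
<table>
  <tr><th>Notifying Member</th><th>Product covered</th><th>Date notified</th></tr>
  <tr><td>Japan</td><td>Electrical appliances</td><td>2005-06-08</td></tr>
</table>
"""

records = extract_records(SAMPLE)
print(records)
# → [{'country': 'Japan', 'product': 'Electrical appliances', 'date': '2005-06-08'}]
```

In this sketch the table structure supplies the column boundaries and the keyword dictionary supplies the meaning of each column; a trained classifier could take the place of `classify_header` where headers are ambiguous or missing.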

Key words: Ontology; Information extraction; TBT
Received: 08 June 2005      Published: 25 September 2005
Classification number: TP274.2

Corresponding Authors: Yu Yang     E-mail: bgdyuyang@emails.bjut.edu.cn
About authors: Zhai Dongsheng, Yu Yang, Li Li

Cite this article:

Zhai Dongsheng, Yu Yang, Li Li. The Technology of Web Information Extraction and Its Application in the TBT Early-Warning System. New Technology of Library and Information Service, 2005, 21(9): 76-79.

URL:

https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2005.09.17     OR     https://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2005/V21/I9/76

