New Technology of Library and Information Service  2009, Vol. 3 Issue (2): 83-88    DOI: 10.11925/infotech.1003-3513.2009.02.14
The Application of Regular Expressions in Online Oil Price Event
Shao Zengrong  Li Ying  Fan Tijun
(School of Business, East China University of Science and Technology, Shanghai 200237, China)
Abstract  

Exploiting the strengths of regular expressions in string manipulation, this paper extracts oil price information from noisy, irregular web pages. It identifies the key points and difficulties of the implementation and verifies the ability of regular expressions to describe the structure of strings.
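
The kind of extraction the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample page fragment, the pattern, and the function name `extract_prices` are all hypothetical, chosen only to show how a regular expression can first clean noisy markup and then capture structured price records.

```python
import re

# Hypothetical noisy page fragment; the markup and values are invented for illustration.
SAMPLE_HTML = """
<div class="news"><p>Brent crude closed at <b>US$ 45.12</b>/barrel on 2009-02-20,</p>
<p>while WTI   settled at $38.94 per barrel&nbsp;(2009-02-20).</p></div>
"""

TAG_RE = re.compile(r"<[^>]+>")   # any HTML tag
SPACE_RE = re.compile(r"\s+")     # runs of whitespace, including newlines

# Assumed record shape: crude grade, then a dollar price, then an ISO-style date.
PRICE_RE = re.compile(
    r"(?P<grade>Brent|WTI)\b[^$]*?"      # grade name, then text up to the dollar sign
    r"\$\s*(?P<price>\d+(?:\.\d+)?)"     # price with optional decimals
    r"\D*?(?P<date>\d{4}-\d{2}-\d{2})"   # next digit run must be the date
)


def extract_prices(html):
    """Return (grade, price, date) tuples found in a noisy HTML fragment."""
    text = TAG_RE.sub("", html)              # data cleaning: drop tags
    text = text.replace("&nbsp;", " ")       # decode the one entity used here
    text = SPACE_RE.sub(" ", text).strip()   # normalize whitespace
    return [
        (m.group("grade"), float(m.group("price")), m.group("date"))
        for m in PRICE_RE.finditer(text)
    ]


if __name__ == "__main__":
    for record in extract_prices(SAMPLE_HTML):
        print(record)
```

Cleaning before matching keeps the extraction pattern short, since it no longer has to tolerate tags or irregular spacing inside a record. Real pages would likely need a broader pattern and proper entity decoding.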

Key words: Regular Expression; Webpage Data Extraction; Data Cleaning; String Processing
Received: 09 October 2008      Published: 25 February 2009
CLC number: TP391

Corresponding Authors: Shao Zengrong     E-mail: shaozengrong@hotmail.com
About authors: Shao Zengrong, Li Ying, Fan Tijun

Cite this article:

Shao Zengrong, Li Ying, Fan Tijun. The Application of Regular Expressions in Online Oil Price Event. New Technology of Library and Information Service, 2009, 3(2): 83-88.

URL:

http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/10.11925/infotech.1003-3513.2009.02.14     OR     http://manu44.magtech.com.cn/Jwk_infotech_wk3/EN/Y2009/V3/I2/83

